(54) Title: METHOD FOR PRODUCING SECOND-GENERATION LIBRARY

(57) Abstract: The present invention relates to a method for generating a second-generation
library. In a first step, a library of encoded molecules associated with an identifier nucleic
acid comprising codons identifying chemical entities that have participated in the formation of
the encoded molecule is provided. In a second step, the library is partitioned and encoded
molecules having a certain property are selected. Codons of identifiers of selected encoded
molecules are subsequently identified, and a second-generation library is prepared using at
least some of the chemical entities coded for by the identified codons. The new focused library
may be used for another partition step to select encoded molecules with a certain property.
Title

METHOD FOR PRODUCING SECOND-GENERATION LIBRARY 

Various patent and non-patent references cited in the present application are hereby
incorporated by reference in their entirety.

Technical Field of the Invention 

The invention relates to a method for producing a second-generation compound
library with an improved desired property profile. In nature, and in artificial methods
based on the natural system, the parent genotype is carried on to the offspring and
results in a phenotype in which the exact type and sequence of amino acids is
retained, unless a mutation and/or recombination has occurred. The present method
retains only the identity of chemical entities, e.g. amino acids, while the sequence
is wholly or partly scrambled. The result is a focused second-generation library with
lower diversity.

Background of the invention 

Biological evolution is based on the survival of specific genotypes that encode
phenotypes with the most suitable functionalities in a certain environment. In all
living species DNA programs the genotype. DNA serves two important functions in the
natural selection process. One function is obviously to encode the type of
nucleotides used and the other function is to encode the specific order of
nucleotides in a nucleic acid sequence. The strategy used in nature, i.e. encoding
the exact type as well as the precise sequence of nucleotides, ensures an extremely
high similarity between the progeny and its parents. Thus, conserving almost the exact
sequence and type of the nucleotides is absolutely essential in order to create
offspring with a high functionality. The changes in the genotype from one generation
to another, which allow for evolution, are determined by the random mutation rate
and recombination between the two parents' genotypes.

Natural selection cannot afford too many changes in the DNA from one generation
to the next if the survival of the species is to be secured. Therefore, nature has
evolved sophisticated means to proofread the copying of the DNA from the parents
to their progeny and has secured that the characteristics of the phenotype are
carried from one generation to the next only by the DNA.

Within the art of selecting ligands from a library of encoded polypeptides associated
with a corresponding identifier nucleic acid sequence, the method of nature is used.
Thus, when more than a single library generation is needed, the identifier nucleic
acid sequences (genotype) carry the information from one generation to the next.

WO 93/03172 A1 discloses a method for identifying a polypeptide ligand having a
desired property in a polypeptide library. In a first step, a translatable mRNA mixture
is provided, which is mixed with a mixture of ribosome complexes to form a
translation product attached to the mRNA strand responsible for the formation thereof.
In a second step, the ribosome complexes binding to a target are partitioned from the
remainder of the library. In a third step, the mRNA strands of the partitioned
ribosome complexes, which have bound to the target, are amplified. The amplified
mRNA strands are used for the production of a second-generation library, which is
subjected to a renewed contact with the target. The method is repeated a sufficient
number of times until the size of the library has narrowed to a small pool of
high-affinity binders.

In WO98/31700 A1 a method for selecting a DNA molecule, which encodes a
desired protein, is disclosed. The method implies the initial presence of a pool of
candidate RNA molecules, which subsequently is translated into a corresponding
pool of RNA-protein fusions. Subsequently the mRNA-protein fusion products are
subjected to a selection process, i.e. the fusion products are presented to a target
molecule, and a new pool of complexes capable of binding to the target is
partitioned. From the new pool of complexes, the mRNAs are recovered and amplified
for use in a subsequent round of library generation. Xu, L. et al., Chemistry & Biology,
Vol. 9, 933-942, August 2002, discloses a practical embodiment in which a library of
more than 10^12 unique mRNA-protein fusion products is used through ten rounds of
library generation and selection to identify a high-affinity binding protein.

The preparation of libraries of synthetic molecules associated with a corresponding
identifier nucleic acid sequence, and the selection of synthetic molecules from such
libraries, have been the subject of various patent applications. When two or more
generations of libraries are needed, the identifier nucleic acid sequence is used as
carrier between an initial library and the next generation library.

Thus, in WO 00/23458A1 libraries of complexes comprising non-natural molecules
attached to corresponding nucleic acid sequences are suggested. After a selection
of the library has been conducted, the nucleic acid sequences of successful
complexes are amplified by PCR and a new library is prepared from these nucleic acid
sequences. The same method of carrying information from an initial library to the
next library is applied in WO 02/074929A2 and WO 02/103008A2.

The present invention provides a new method for evolving encoded molecules. The
method is based on the identification of chemical entities used in the synthesis of
reaction products of successful complexes and the application, at least in part, of
these chemical entities in the preparation of the next generation library. The
utilization of preferable chemical entities and the exclusion of certain undesired
chemical entities in the next library generation generally imply that the next
generation library has a smaller size compared to the size of the initial library,
while, at the same time, retaining the desirable encoded molecules in the library.

Summary of the Invention 

The present invention concerns a method for producing a composition of molecules
with an improved desired property, said method comprising the steps of: providing
an initial library comprising a plurality of different encoded molecules associated with
a corresponding identifier nucleic acid sequence, wherein each encoded molecule
comprises a reaction product of multiple chemical entities and the identifier nucleic
acid sequence comprises codons identifying said chemical entities; subjecting the
initial library to a condition partitioning members having encoded molecules
displaying a predetermined property from the remainder of the initial library;
identifying codons of the identifier nucleic acid sequences of the partitioned
members of the initial library; and preparing a second-generation library of encoded
molecules using the chemical entities coded for by the codons of the partitioned
members of the initial library or a part thereof.

The present invention relates to a novel approach to performing evolution of molecules
with a desired property, said approach being different from the approach of nature
and the prior art. The invention is based on the selection of chemical entities, the
counterpart of amino acids in nature, instead of the precise sequence of chemical
entities. This new approach is powerful in ex vivo conditions when high functionality
of the offspring is not vital for success or when the number of chemical entities
relative to the number of reactants used in each encoded molecule is high.

The method disclosed herein will be increasingly effective as the library size
increases. This is due to the fact that more chemical entities are used when the library
size is increased while the number of reactions for the formation of the encoded
molecule is fixed, and the fact that different chemical entities tend to be involved in
encoded molecules having the desired property. The chemical entities, which are part of
the final selected molecules, will be enriched in each round of selection. Finally, when
the diversity has been extensively reduced, the enriched molecules are decoded
from the identifier nucleic acid sequence comprising the codons of the chemical
entities that have participated in the formation of the encoded molecule.

The strategy of performing enrichment of chemical entities instead of specific
combinations of chemical entities more efficiently searches the chemical space for all
combinations of chemical entities that are likely to show a certain property, such as a
binding ability towards a target. Thus, chemical entities having a certain impact on
the formation of encoded molecules are allowed to recombine in each new library
generation. In a certain aspect of the invention, the recombination is
random, i.e. once a chemical entity has qualified as being of interest it is allowed in
every position of the reaction sequence. In another aspect of the invention, the
recombination is semi-random, i.e. once a chemical entity is qualified as being of
interest it is used in a certain position in the reaction sequence of the encoded
molecule. In still a further aspect of the invention, the amount of the chemical entity
used in a subsequent library generation is dependent on the frequency and the amount of
the partitioned library members.
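
The random and semi-random recombination modes described above can be illustrated with
a minimal Python sketch. It is not part of the disclosed method: the function names, the
representation of an identifier as a tuple of codons, and the codon-to-entity table are
all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def enriched_entities(selected_identifiers, codon_to_entity):
    """Count how often each chemical entity is encoded among the selected identifiers."""
    counts = Counter()
    for identifier in selected_identifiers:
        for codon in identifier:
            counts[codon_to_entity[codon]] += 1
    return counts


def random_recombination(selected_identifiers, codon_to_entity, n_positions):
    """Random mode: every qualified entity is allowed at every position."""
    pool = sorted(enriched_entities(selected_identifiers, codon_to_entity))
    return list(product(pool, repeat=n_positions))


def semi_random_recombination(selected_identifiers, codon_to_entity):
    """Semi-random mode: an entity is reused only at positions where it was observed."""
    n_positions = len(selected_identifiers[0])
    per_position = [set() for _ in range(n_positions)]
    for identifier in selected_identifiers:
        for position, codon in enumerate(identifier):
            per_position[position].add(codon_to_entity[codon])
    return list(product(*(sorted(entities) for entities in per_position)))


# Hypothetical codon table and two identifiers recovered after partitioning.
codon_to_entity = {"AAA": "A", "CCC": "B", "GGG": "C", "TTT": "D"}
selected = [("AAA", "CCC", "GGG"), ("CCC", "AAA", "GGG")]

random_library = random_recombination(selected, codon_to_entity, n_positions=3)
semi_random_library = semi_random_recombination(selected, codon_to_entity)
```

In this toy case entity D never passes the partitioning and is excluded, while the
combination (A, A, C) appears in the second-generation library although it was not among
the selected parents, which reflects the scrambling of sequence while retaining identity.
The counts returned by `enriched_entities` could be used to weight the amount of each
entity in the next generation.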


The present invention may be of special interest when a group of chemical entities
is selected from a larger pool of chemical entities in the formation of a first library.
Selecting chemical entities resulting in encoded molecules having a certain property
in a first library and spiking with remaining chemical entities of the pool allows for the
formation of a second-generation library not necessarily of a smaller size but
enriched in encoded molecules having a certain property.

The second-generation library may be formed of a reaction product of the chemical
entities without attaching the reaction product to a nucleic acid. In an embodiment of
such a second-generation library the individual reaction products are formed in
discrete reaction compartments in accordance with traditional combi-chem technology.
In a certain aspect of the invention, the second-generation library is prepared as the
first generation library, i.e. the second-generation library comprises a plurality of
different encoded molecules associated with a corresponding identifier nucleic acid
sequence, wherein each encoded molecule comprises a reaction product of multiple
chemical entities and the identifier nucleic acid sequence comprises codons
identifying said chemical entities.

In a preferred aspect, the invention comprises subjecting the second-generation
library to a condition partitioning members having encoded molecules displaying a
predetermined property from the remainder of the second-generation library. The
second-generation library may be partitioned as to the same property or a different
property. Notably, the second-generation library can be screened against the same
target or a different target.

After the partitioning of the second-generation library, the invention comprises the
step of deducing the identity of the encoded molecule(s) using the identifier nucleic
acid sequence, when present. Optionally, a third or further generation library may be
formed and screened before the final deducing step is performed. In a certain
embodiment, the codons of the identifier nucleic acid sequence are decoded to
establish the synthesis history of the encoded molecules. The synthesis history
includes the identity of the chemical entities used and the point in time they enter
the sequence of reactions resulting in the encoded molecule.
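
As a rough illustration of decoding a synthesis history, the sketch below splits an
identifier into codons and reports each chemical entity with its reaction step. It
assumes fixed-length, contiguous codons without framing sequences and a hypothetical
lookup table; the text does not prescribe any of these details.

```python
def decode_identifier(identifier, codon_length, codon_to_entity):
    """Return the synthesis history as a list of (reaction step, chemical entity) pairs."""
    if len(identifier) % codon_length != 0:
        raise ValueError("identifier length is not a multiple of the codon length")
    history = []
    starts = range(0, len(identifier), codon_length)
    for step, start in enumerate(starts, start=1):
        codon = identifier[start:start + codon_length]
        history.append((step, codon_to_entity[codon]))
    return history


# Hypothetical 4-nucleotide codons; entity names are placeholders.
table = {"ACGT": "entity_1", "TGCA": "entity_2"}
history = decode_identifier("ACGTTGCA", 4, table)
```

The position of a codon in the identifier stands in for the point in time at which the
corresponding entity entered the reaction sequence.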

The encoded molecule is preferably a reaction product in which multiple chemical
entity precursors have participated. The encoded molecule may have any chemical
structure. Generally, the multiple chemical entities are precursors for a structural unit
appearing in the encoded molecule. However, the chemical entities may also
perform a chemical reaction with the nascent encoded molecule, which results in an
altering or removal of chemical groups. In certain aspects of the invention, the
encoded molecule is a scaffolded molecule, i.e. various chemical entities have
reacted with a chemical core structure like steroid, benzodiazepine, retinol,
camphor, ephedrine, penicillin, cannabinol, coumarin, oxazole, etc. In certain other
aspects of the invention the encoded molecule is fully or partly a polymer. The
polymer may be of a type which occurs naturally or may be a non-naturally occurring
polymer. Nature only has the possibility of preparing α-polypeptides using the
recognition of a codon of an mRNA strand by the anticodon of a charged tRNA. In
some aspects of the invention, the encoded molecule is not an α-peptide. Notably, in
some aspects of the invention, the chemical entities are reacted without enzymatic
interaction to produce the encoded molecule.

The encoded molecule can be associated with the nucleic acid sequence identifier
in any appropriate way. In a certain aspect of the invention, the encoded molecule
associated with the corresponding identifier nucleic acid sequence is a bifunctional
complex. The bifunctional complex may be formed by covalent or non-covalent
attachment of the encoded molecule to the identifier nucleic acid sequence. In another
aspect of the invention, an identifier nucleic acid sequence is physically a distinct
entity separated from the encoded molecule, wherein the identifier identifies the
spatial position of an encoded molecule, e.g. in the same compartment in which an
encoded molecule is formed a corresponding identifier oligonucleotide is generated.

The conditions partitioning complexes of interest from the remainder of the library
may be chosen from a variety of possibilities. In one aspect the condition relates to
physical parameters, so that complexes displaying a physical stability under e.g.
certain temperature conditions, certain acidic conditions, certain radiation conditions
etc. are selected from the library. In other aspects of the invention the condition for
partitioning the desired complexes includes subjecting the initial library to a
molecular target and partitioning complexes binding to this target. The molecular
target may be any compound of interest. Exemplary targets are proteins, carbohydrates,
polysaccharides, hormones, receptors, antibodies, viruses, antigens, cells, tissues etc.
In certain aspects the target is immobilized on a solid support, such as column
material, and contacted with the candidate complexes in a fluid medium, followed by a
partitioning of the complexes capable of binding to the target under the contacting
conditions used. Typically the binding complexes are eluted from the column using
increased stringency conditions.

The complexes as such or only the identifier part is harvested after the partitioning
step. Usually the identifier nucleic acid sequences are amplified prior to the
identification step. The amplification is suitably performed applying polymerase chain
reaction (PCR). The amplified identifiers may be explicitly or implicitly identified. When
the codons are identified explicitly, the sequence and identity of nucleotides in the
codon is made known to the experimenter, whereas, when the codons of the
identifiers are implicitly identified, the experimenter is not presented with the
information.

Any suitable method for identifying codons may be used. In a certain aspect of the
invention, traditional sequencing, e.g. by using a modification of the Sanger method
or pyrosequencing methods, identifies the codons. In another aspect of the
invention, the codons of the identifier nucleic acid sequences of the partitioned members
of the initial library are identified by contacting said identifier nucleic acid sequences
with a pool of nucleic acid fragments under conditions allowing for hybridisation.

The pool of nucleic acid fragments may be immobilized or in solution. In a certain
aspect of the invention, the pool of nucleic acid fragments comprises a plurality of
single stranded nucleic acid probes immobilized in discrete areas of a solid support,
wherein the nucleic acid probes are capable of hybridising to a codon of the
identifier nucleic acid sequence comprising codons. The nucleic acid probes may be
positioned on a microarray, such that the identity of the codons is revealed by
observing the discrete areas of the support in which a hybridisation event has
occurred.

The nucleic acid probe can be directly hybridised to the identifier, or the nucleic acid
probe of the array is hybridised to an identifier nucleic acid sequence through an
adapter oligonucleotide having a sequence complementing the probe as well as one
or more codons of the identifier nucleic acid sequence. The probe may identify a
single codon of an identifier, or a probe of the array is capable of hybridising to two
codons of the identifier nucleic acid sequence or a sequence complementary to said
sequence. The ability to hybridise to two or more codons makes it possible to study the
influences of neighbouring chemical entities on each other. In a certain aspect, a
nucleic acid probe of the array is capable of hybridising to all codons of an identifier
nucleic acid sequence. This latter option will fully decode the identity of the encoded
molecule. Usually, however, a full decoding is only possible for a relatively small
library size, as it presupposes a nucleic acid probe for each member of the library.

When single codons are detected, useful information about a certain codon may be
gathered by detecting the codon together with a framing sequence identifying the
position in the reaction history of the chemical entity corresponding to said codon.

As an example, if a library of complexes is prepared from 100 chemical entities and
three reactions, i.e. each identifier comprises 4 codons, the library size is 10^8.
For most practical uses 10^8 is in excess of what is possible to detect on an
array, especially if multiple determinations for each identifier are considered
necessary to obtain a high accuracy. However, an array of just 100 probes
complementary to the 100 codons will reveal important information prior to or
subsequent to a selection. In the event a framing sequence is detected together with
the codon an array of 400 probes is needed.
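
The arithmetic of this example can be checked with a short Python sketch. The function
names are illustrative only, and the sketch assumes that the same set of chemical
entities is available at every codon position.

```python
def library_size(n_entities, n_codons):
    """Number of distinct identifiers when any entity may occupy any codon position."""
    return n_entities ** n_codons


def probes_needed(n_entities, n_codons, with_framing):
    """Probes for single-codon detection, with or without a position-specific framing sequence."""
    return n_entities * n_codons if with_framing else n_entities


full_decoding_probes = library_size(100, 4)                  # one probe per library member
codon_only_probes = probes_needed(100, 4, with_framing=False)
framed_codon_probes = probes_needed(100, 4, with_framing=True)
```

With 100 entities and 4 codons this gives 10^8 library members, 100 probes for codon-only
detection and 400 probes when the framing sequence is detected together with the codon.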

A suitable method for identifying a hybridisation event is to use a label. Therefore,
in a preferred embodiment, the existence of a hybridisation event is measured
through labelling of the identifier nucleic acid sequence, or an amplification product
thereof. When the label emits light, the hybridisation event is measured by the
emission of light in a scanner. To reveal the relative abundance of each chemical
entity in the library of encoded molecules, the relative intensity of light in each
discrete spot is measured.

The measurement of a hybridisation event may be conducted by various methods
known in the art. In the event the label emits light, the presence or absence of a
hybridisation event may be measured in a scanner, e.g. a confocal scanner. The
scanner may be connected with computer software, which is able to quantify the
amount of light measured. The amount of light measured correlates with the
amount of identifier annealed to the probes. Thus, it is possible to measure not only
the presence or absence of one or more codons of an identifier; it is also possible to
measure the relative amount of the codons in one or more identifiers.
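
One simple way to turn spot intensities into relative codon amounts is to normalise
them to fractions, as in the sketch below. This assumes, for illustration only, that the
measured signal is proportional to the amount of annealed identifier after subtracting a
uniform background; the text states only that the two correlate. The function name and
the codon labels are hypothetical.

```python
def relative_abundance(spot_intensity, background=0.0):
    """Convert per-codon spot intensities into fractions of the total corrected signal."""
    corrected = {
        codon: max(value - background, 0.0)
        for codon, value in spot_intensity.items()
    }
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no signal above background")
    return {codon: value / total for codon, value in corrected.items()}


# Hypothetical scanner readings for two codon-specific spots.
fractions = relative_abundance({"codon_1": 300.0, "codon_2": 100.0})
```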

After the complexes have been partitioned and the specific codons have been
identified on the microarray, the information can be used to design optimized libraries
including chemical entities based on both the selection data and the chemical
structure. The microarray analysis will first of all detect which chemical entities pass the
partitioning step. Secondly, the signal intensity on the microarray will reflect the
relative binding affinity of the chemical entities. Finally, the structures of the
chemical entities are directly identified due to the position of the probes on the array. For
instance, chemical entities that are strongly selected in a partitioning process but
possess some unfavourable chemical structure can be excluded in the next
generation of library. Similarly, chemical entities that are weakly selected in a partitioning
process but possess some favourable chemical structure can be included in the next
generation of library. Thus, the next generation library design can be based both on
a rational choice of chemical entities with lead-like structures and the selection
pressure detected on the microarray.
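
The combination of selection data and structural judgement might be sketched as
follows. The threshold rule, the function name and the entity labels are assumptions made
for illustration; the text does not specify how the two criteria are to be weighed.

```python
def design_next_library(signal, threshold, unfavourable=(), favourable=()):
    """Pick entities for the next library from array signal plus structural judgement."""
    # Entities whose array signal indicates that they passed the partitioning step.
    chosen = {entity for entity, value in signal.items() if value >= threshold}
    # Strongly selected entities with an unfavourable structure are excluded.
    chosen -= set(unfavourable)
    # Weakly selected entities with a favourable structure are included.
    chosen |= set(favourable) & set(signal)
    return sorted(chosen)


# Hypothetical normalised signals for four chemical entities.
signal = {"E1": 0.9, "E2": 0.8, "E3": 0.1, "E4": 0.05}
next_library = design_next_library(
    signal, threshold=0.5, unfavourable={"E2"}, favourable={"E3"}
)
```

Here E2 is dropped despite its strong signal and E3 is kept despite its weak signal,
mirroring the two cases described above.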

Another method of identifying codons includes that the nucleic acid fragments are
primer oligonucleotides, and the identification involves subjecting the hybridisation
complex between the primer oligonucleotides and the identifier nucleic acid
sequences to a condition allowing for an extension reaction to occur when the
primer is sufficiently complementary to a part of the identifier nucleic acid sequence,
and evaluating, based on measurement of the extension reaction, the presence,
absence, or relative abundance of one or more codons.

The extension reaction requires a primer, a polymerase as well as a collection of
deoxyribonucleotide triphosphates (abbreviated dNTP's herein) to proceed. An
extension product may be obtained in the event the primer is sufficiently complementary
to an identifier oligonucleotide for a polymerase to recognise the double helix as a
substrate. After binding of the polymerase to the double helix, the
deoxyribonucleotide triphosphates (blend of dATP, dCTP, dGTP, and dTTP) are incorporated into
the extension product using the identifier oligonucleotide as template. The conditions
allowing for the extension reaction to occur usually include a suitable buffer. The
buffer may be any aqueous or organic solvent or mixture of solvents in which the
polymerase has a sufficient activity. To facilitate the extension process the
polymerase and the mixture of dNTP's are generally included in a buffer which is added
to the identifier oligonucleotide and primer mixture. An exemplary kit comprising the
polymerase and the dNTP's for performing the extension process comprises the
following: 50 mM KCl; 10 mM Tris-HCl at pH 8.3; 1.5 mM MgCl2; 0.001% (wt/vol)
gelatin; 200 µM dATP; 200 µM dTTP; 200 µM dCTP; 200 µM dGTP; and 2.5 units
Thermus aquaticus (Taq) DNA polymerase I (U.S. Pat. No. 4,889,818) per 100
microliters (µl) of buffer.

The primer may be selected to be complementary to one or more codons or parts of
such codons. The length of the primers may be determined by the length of the
codons; however, the primers usually are at least about 11 nucleotides in length,
more preferred at least 15 nucleotides in length, to allow for an efficient extension by
the polymerase. The presence or absence of one or more codons is indicated by the
presence or absence of an extension product. The extension product may be
measured by any suitable method, such as size fractioning on an agarose gel and
staining with ethidium bromide.

In a preferred embodiment the admixture of identifier oligonucleotide and primer is
thermocycled to obtain a sufficient number of copies of the extension product. The
thermocycling is typically carried out by repeatedly increasing and decreasing the
temperature of the mixture within a temperature range whose lower limit is about 30
degrees Celsius (30°C) to about 55°C and whose upper limit is about 90°C to about
100°C. The increasing and decreasing can be continuous, but is preferably phasic
with time periods of relative temperature stability at each of the temperatures favouring
polynucleotide synthesis, denaturation and hybridization.

When a single complex is analysed in accordance with the present method, the
result may be used to verify the presence or absence of a specific chemical entity
during the formation of the display molecule. The formation of an extension product is
indicative of the presence of an oligonucleotide part complementary to the primer in
the identifier oligonucleotide. Conversely, the absence of an extension product is
indicative of the absence of an oligonucleotide part complementary to the primer in
the identifier oligonucleotide. Selecting the sequence of the primer such that it is
complementary to one or more codons will therefore provide information on the
structure of the encoded molecule coded for by this codon(s).
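
The logic of inferring a codon from an extension product can be mimicked in silico.
The sketch below is a deliberate simplification: it treats "sufficiently complementary"
as perfect complementarity over the whole primer, whereas a real polymerase tolerates
some mismatches. The sequences and function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def extension_expected(identifier, primer):
    """True if the primer can anneal perfectly somewhere on the identifier."""
    return reverse_complement(primer) in identifier


# Hypothetical identifier containing an 11-nucleotide codon region.
codon_region = "AAACCCGGGTT"
identifier = "GCGC" + codon_region + "GCGC"

matching_primer = "AACCCGGGTTT"     # reverse complement of the codon region
mismatched_primer = "AACCCGGGTTA"   # differs at the 3' terminal nucleotide

codon_present = extension_expected(identifier, matching_primer)
codon_absent = not extension_expected(identifier, mismatched_primer)
```

A predicted extension product thus corresponds to the presence of the codon, and hence
of the chemical entity it encodes, in the analysed complex.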

In a preferred aspect of the invention, a second primer complementary to a sequence of
the extension product is included in the mixture of the identifier oligonucleotide
and the primer oligonucleotide. The second primer is also termed reverse primer
and ensures an exponential increase of the number of produced extension products.
The method using a forward and reverse primer is well known to the skilled person in
the art and is generally referred to as polymerase chain reaction (abbreviated PCR)
in the present application with claims. In one embodiment of the invention the
reverse primer is annealed to a part of the extension product downstream, i.e. near
the 3' end of the extension product, or a part complementing the coding part of the
identifier oligonucleotide. In another embodiment, the first primer (forward primer)
anneals to an upstream position of the identifier oligonucleotide, preferably before
the coding part, and the reverse primer anneals to a sequence of the extension
product complementing one or more codons or parts thereof.

The amplicons resulting from the PCR process may be stained during or following
the reaction to ease the detection. A staining after the PCR process may be
prepared with e.g. ethidium bromide or a similar staining agent. As an example,
amplicons from the PCR process are run on an agarose gel and subsequently stained with
ethidium bromide. Under UV illumination bands of amplicons become visible. It is
possible to incorporate the staining agent in the agarose gel or to allow a solution of
the staining agent to migrate through the gel. The amplicons may also be stained
during the PCR process by an intercalating agent, like SYBR. When the intercalating
agent is present while the amplification proceeds, it will be incorporated in the double
helix. The intercalating agent may then be made visible by irradiation by a suitable
source.

The intensity of the staining is informative of the relative abundance of a specific
amplicon. Thus, it is possible to quantify the occurrence of a codon in an identifier
oligonucleotide. When a library of bifunctional complexes has been subjected to a
selection, the codons in the pool of identifier oligonucleotides which has been
selected can be quantified using this method. As an example, a sample of the selected
identifier oligonucleotides is subjected to various PCR amplifications with different
primers in separate compartments and the PCR product of each compartment is
analysed by electrophoresis in the presence of ethidium bromide. The bands that
appear can be quantified by a densitometric analysis after irradiation by ultraviolet
light and the relative abundance of the codons can be measured.

Alternatively, the primers may be labelled with a suitable small molecule, like biotin
or digoxigenin. A PCR-ELISA analysis may subsequently be performed based on
the amplicons comprising the small molecule. A preferred method includes the
application of a solid support covered with streptavidin or avidin when biotin is used as
label and anti-digoxigenin when digoxigenin is used as the label. Once captured, the
amplicons can be detected using an enzyme-labelled avidin or anti-digoxigenin
reporter molecule similar to a standard ELISA format.

To avoid laborious post-PCR handling steps required to evaluate the amplicons, it is
in a certain embodiment preferred to measure the extension process in "real time".
Several real time PCR processes have been developed, and all the suitable real time
PCR processes available to the skilled person in the art can be used in the evaluating
step of the present invention and are included in the present scope of protection. The
PCR reactions discussed below are of particular interest.

The monitoring of accumulating amplicons in real time has been made possible by
labelling of primers, probes, or amplicons with fluorogenic molecules. The real time
PCR amplification is usually performed faster than conventional PCR, mainly due to
reduced cycle times and the use of sensitive methods for detection of emissions
from the fluorogenic labels. The most commonly used fluorogenic
oligoprobes rely upon fluorescence resonance energy transfer (FRET) between
fluorogenic labels or between one fluorophore and a dark or "black-hole"
non-fluorescent quencher (NFQ), which disperses energy as heat rather than
fluorescence. FRET is a spectroscopic process by which energy is passed between
molecules separated by 10-100 Å that have overlapping emission and absorption
spectra. An advantage of many real time PCR methods is that they can be carried out in
a closed system, i.e. a system which does not need to be opened to examine the
result of the PCR. A closed system implies a reduced result turnaround,
minimisation of the potential for carry-over contamination and the ability to closely scrutinise
the assay's performance.

The real time PCR methods currently available to the skilled person can be
classified into either amplicon sequence specific or non-specific methods. The basis for
the non-specific detection methods is a DNA-binding fluorogenic molecule. Included
in this class are the earliest and simplest approaches to real time PCR: ethidium
bromide, YO-PRO-1, and SYBR® Green I all fluoresce when associated with
double stranded DNA which is exposed to a suitable wavelength of light. This
approach requires the fluorescent agent to be present during the PCR process and
provides for a real time detection of the fluorescent agent as it is incorporated into
the double stranded helix.

The amplicon sequence specific methods include, but are not limited to, the
TaqMan®, hairpin, LightCycler®, Sunrise®, and Scorpion® methods. The LightCycler®
method, also designated "HybProbes", makes use of a pair of adjacent, fluorogenic
hybridisation oligonucleotide probes. A first, usually the upstream oligoprobe, is
labelled with a 3' donor fluorophore and the second, usually the downstream probe, is
commonly labelled with either a LightCycler Red 640 or Red 705 acceptor fluorophore
at the 5' terminus, so that when both oligoprobes are hybridised the two fluorophores
are located in close proximity, such as within 10 nm, of each other. The close
proximity provides for the emission of fluorescence when irradiated with a suitable
light source, such as a blue diode in the case of the LightCycler®. The region for
annealing of the probes may be any suitable position that does not interfere with the
primer annealing. In a suitable setup, the sites for binding the probes are positioned
downstream of the codon region on the identifier oligonucleotide. Alternatively, when
a reverse primer is used, the region for annealing the probes may be at the 3' end of
the strand complementing the identifier oligonucleotide. Another embodiment of the
LightCycler method includes that the pair of oligonucleotide probes is annealed to
one or more codons, and primer sites exterior to the coding part of the identifier
oligonucleotide are used for PCR amplification.

The TaqMan® method, also referred to as the 5' nuclease or hydrolysis method,
requires an oligoprobe, which is attached to a reporter fluorophore, such as
6-carboxy-fluorescein, and a quencher fluorophore, such as
6-carboxy-tetramethyl-rhodamine, at each end. When in close proximity, i.e. annealed
to an identifier oligonucleotide, or a sequence complementing the identifier
oligonucleotide, the quencher will "hijack" the emissions that have resulted from the
excitation of the reporter. As the polymerase progresses along the relevant strand, it
displaces and then hydrolyses the oligoprobe via its 5'→3' endonuclease activity. Once
the reporter is removed from the extinguishing influence of the quencher, it is able to
release excitation energy at a wavelength that can be monitored by a suitable
instrument, such as the ABI Prism® 7700. The fractional cycle number at which the
real-time fluorescence signal mirrors progression of the reaction above the background
noise is normally used as an indicator of successful identifier oligonucleotide
amplification. This threshold cycle (CT) is defined as the PCR cycle in which the gain
in fluorescence generated by the accumulating amplicons exceeds 10 standard
deviations of the mean base line fluorescence. The CT is proportional to the number of
identifier oligonucleotide copies present in the sample. The TaqMan probe is usually
designed to hybridise at a position downstream of a primer binding site, be it a
forward or a reverse primer. When the primer is designed to anneal to one or more
codons of the identifier oligonucleotide, the presence of these one or more codons is
indicated by the emittance of light. Furthermore, the quantity of the identifier
oligonucleotides comprising the one or more codons may be measured by the CT value.
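
The threshold-cycle definition above can be sketched in a few lines. This is a
minimal illustration and not the algorithm of any particular instrument: the function
name `threshold_cycle`, the use of the first `baseline_cycles` readings as the base
line, and the linear interpolation used to obtain a fractional cycle are all
assumptions made for the example.

```python
from statistics import mean, stdev

def threshold_cycle(fluorescence, baseline_cycles=5, n_sd=10.0):
    """Return the fractional cycle at which the signal first exceeds
    baseline mean + n_sd * baseline standard deviation, or None.

    fluorescence[i] is the reading after cycle i + 1 (hypothetical input).
    """
    base = fluorescence[:baseline_cycles]
    threshold = mean(base) + n_sd * stdev(base)
    for i in range(baseline_cycles, len(fluorescence)):
        if fluorescence[i] > threshold:
            prev = fluorescence[i - 1]
            # Interpolate linearly between cycle i and cycle i + 1 (1-based).
            frac = (threshold - prev) / (fluorescence[i] - prev)
            return i + frac
    return None  # the signal never rose above the threshold

readings = [1.0, 1.1, 0.9, 1.0, 1.0, 1.2, 2.0, 4.0, 8.0]
ct = threshold_cycle(readings)
```

In common qPCR practice a lower CT indicates more starting copies of the identifier,
so CT values measured with codon-specific primers can be compared between codons.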

The Hairpin method involves an oligoprobe in which a fluorophore and a quencher
are positioned at the termini. The labels are held in close proximity by distal stem
regions of homologous base pairing deliberately designed to create a hairpin
structure, which results in quenching either by FRET or by a direct energy transfer by
a collisional mechanism due to the intimate proximity of the labels. When direct
energy transfer by a collisional mechanism is used, the quencher is usually different
from that of the FRET mechanism, and is suitably
4-(4'-dimethylamino-phenylazo)-benzene (DABCYL). In the presence of a
complementary sequence, usually downstream of a primer, or within the bounds of the
primer binding sites in case more than a single primer is used, the oligoprobe will
hybridise, shifting into an open configuration. The fluorophore is now spatially
removed from the quencher's influence and fluorescence emissions are monitored
during each cycle. In a certain aspect, the hairpin probe may be designed to anneal to
a codon in order to detect this codon if present on the identifier oligonucleotide. This
embodiment may be suitable if codons only differ from each other by a single or a few
nucleotides, because it is well-known that the occurrence of a mismatch between a
hairpin oligoprobe and its target sequence has a greater destabilising effect on the
duplex than the introduction of an equivalent mismatch between the target
oligonucleotide and a linear oligoprobe. This is probably because the hairpin structure
provides a highly stable alternate conformation.

The Sunrise and Scorpion methods are similar in concept to the hairpin oligoprobe,
except that the label becomes irreversibly incorporated into the PCR product. The
Sunrise method involves a primer (commercially available as Amplifluor™ hairpin
primers) comprising a 5' fluorophore and a quencher, e.g. DABCYL. The labels are
separated by complementary stretches of sequence that create a stem when the
sunrise primer is closed. At the 3' terminus is a target specific primer sequence. In a
preferred embodiment the target sequence is a codon, optionally more codons. The
sunrise primer's sequence is intended to be duplicated by the nascent complementary
strand and in this way the stem is destabilised, the two fluorophores are held apart,
usually between 15 and 25 nucleotides, and the fluorophore is free to emit its
excitation energy for monitoring. The Scorpion primer resembles the sunrise primer,
but differs in having a moiety that blocks duplication of the signalling portion of the
scorpion primer. The blocking moiety is typically hexethylene glycol. In addition to
the difference in structure, the function of the scorpion primers differs slightly in that
the 5' region of the oligonucleotide is designed to hybridise to a complementary
region within the amplicons. In a certain embodiment the complementary region is a
codon on the identifier oligonucleotide. The hybridisation forces the labels apart,
disrupting the hairpin and permitting emission in the same way as the hairpin probes.

After the selection has been performed, the codon profile is indicative of the chemical
entities that have been used in the synthesis of encoded molecules having a certain
property, such as an affinity towards a target. In the event the selection has been
sufficiently effective, it may be possible directly to deduce a part or the entire
structure of encoded molecules with the desired property. Alternatively, it may be
possible to deduce a structural unit appearing more frequently among the encoded
molecules after the selection, which gives important information on the
structure-activity relationship (SAR). If the selection process has not narrowed the
size of the library to a manageable number, the formation of a second-generation
library is useful. In the formation of the second-generation library, chemical entities
which have not been involved in the synthesis of encoded molecules that have been
successful in the selection may be omitted, thus limiting the size of the new library
and at the same time increasing the concentration of complexes with the requested
property, e.g. the ability to bind to a target. The second-generation library may then
be subjected to more stringent selection conditions to allow only the encoded
molecules with a higher affinity to bind to the target. The second-generation library
may also be generated using the chemical entities coded for in addition to certain
chemical entities suspected of increasing the performance of the final encoded
molecule. The indication of certain successful chemical entities may be obtained from
the SAR. The use in a second-generation library of chemical entities which have
proved to be interesting for further investigation in a preceding library may thus entail
a shuffling with new chemical entities that may focus the second-generation library in
a certain desired direction.
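
The effect of omitting unsuccessful chemical entities on library size can be
illustrated with a short sketch. It assumes, for simplicity, that every combination of
one entity per position is formed; the function names and the integer stand-ins for
chemical entities are hypothetical.

```python
from math import prod

def library_size(entities_per_position):
    """Number of encoded molecules if every combination is formed."""
    return prod(len(entities) for entities in entities_per_position)

def second_generation(first_gen, selected, extra=None):
    """Keep only entities whose codons survived selection; optionally add
    further entities suspected of improving performance (e.g. from SAR)."""
    extra = extra or [set() for _ in first_gen]
    return [
        (set(pos) & set(sel)) | set(add)
        for pos, sel, add in zip(first_gen, selected, extra)
    ]

# Hypothetical library: 3 positions, 100 entities each, 10 survivors per position.
first = [set(range(100)) for _ in range(3)]
survivors = [set(range(10)) for _ in range(3)]
second = second_generation(first, survivors)
shuffled = second_generation(first, survivors, extra=[{200}, set(), set()])
```

Under these assumptions the library shrinks from 1,000,000 to 1,000 members, and
adding one new entity in the first position gives 1,100 members.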

An example of implicit identification of codons includes that the nucleic acid
fragment is associated with a chemical entity precursor capable of being transferred to
a recipient reactive group. The recipient reactive group may be a part of a chemical
scaffold and the chemical entity precursor may add a structural unit to said scaffold.
It is preferred that the nucleic acid fragment codes for the chemical entity. In some
aspects of the present invention each member of the nucleic acid fragment pool
comprises an anticodon, which identifies the chemical entity. When a plurality of
chemical entities is present the anticodon is preferably unique, i.e. a unique
correspondence between the chemical entities and the associated anticodons exists.

The identifier nucleic acid sequence comprises codons, which may be able to pair
with one or more anticodons of the pool of nucleic acid fragments. The pairing
between one or more codons of an identifier nucleic acid sequence and one or more
anticodons is preferably specific, i.e. the one or more codons of the identifier nucleic
acid sequence are only recognized by particular anticodons. A nucleic acid fragment
containing more than one anticodon can encode scaffold molecules where each
anticodon encodes specific chemical entities of that scaffold molecule. The specific
pairing makes it possible implicitly to decode the codon of an identifier nucleic acid
sequence. In the method according to the invention, non-specific pairing between
codons and anticodons can be cleaved with an enzyme or chemically treated to break
the double stranded nucleotides. The non-pairing region can be cleaved using
enzymes that specifically cleave nucleotide sequences with mismatches. Notably, the
enzyme is selected from T4 endonuclease VII, T4 endonuclease I, CEL I, nuclease S1,
or variants thereof. The cleavage is preferably used when more than one codon and
anticodon is involved in pairing between the identifier nucleic acid sequence and the
nucleic acid fragment.



The pool of nucleic acid fragments associated with a chemical entity may comprise
anticodons complemented by codons of one or more identifier nucleic acid sequences
as well as anticodons which are not complemented by codons on any identifier nucleic
acid sequence. In other words, the amount of genetic information contained in the
anticodons of the pool is larger than the amount of genetic information complemented
by the codons.

The contacting of the one or more identifier nucleic acid sequences with the pool of
nucleic acid fragments is usually conducted at conditions which allow for
hybridisation, i.e. conditions at which cognate nucleic acid sequences can anneal to
each other. To facilitate the recovery of nucleic acid fragments which have annealed
to the identifier nucleic acid sequences, the identifier nucleic acid sequences are
usually immobilized on a solid support. Examples of suitable solid supports include
beads and column material, e.g. beads and column material associated with a second
part of a molecular affinity pair to bind identifier nucleic acid sequences attached to
the first part of the molecular affinity pair. In certain aspects of the invention the solid
support is associated with streptavidin and the identifier nucleic acid sequences are
attached to biotin.

When the identifier nucleic acid sequences are immobilized on a solid support, the
pool of nucleic acid fragments is typically present in a mobile phase, i.e. dissolved in
a liquid. The identifier nucleic acids will hybridise to those nucleic acid fragments in
the pool which are sufficiently complementary to a particular part of an identifier
nucleic acid sequence for a binding to occur. Fragments not finding any
complementing sequence will remain in the solution. In the event the identifier
nucleic acid sequences are segregated into codons and the fragments comprise
anticodons, the anticodons which are able to anneal to a codon will be caught, while
fragments not having a cognate codon will be maintained in the mobile phase. When
codons and anticodons are present in the method of the present invention, specific
hybridisation implies that the tendency of an anticodon to cross-hybridise to another
codon will be impeded or avoided. To avoid cross-hybridisation, codons may be
designed such that each codon is distinguished from all other codons by one, two or
more mismatching nucleotides.




The mobile phase is subsequently separated from the solid phase, e.g. by washing,
and the enriched pool of fragments is recovered. The recovery of the nucleic acid
fragments is usually done by subjecting the hybrid to denaturing conditions, i.e.
conditions which separate the two strands. If the parent nucleic acid sequences are
immobilized on beads, the separation of the fragments can be effected using
denaturing conditions and centrifugation/spinning.
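
The capture-and-recover step can be pictured as a filter over the fragment pool. The
sketch below is idealised: it assumes perfect specificity, so that only an exact
reverse complement of a codon anneals, and it ignores hybridisation kinetics. The
sequences, entity names and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def enrich(fragment_pool, identifier_codons):
    """Split a pool of anticodon fragments into captured and washed-away parts.

    fragment_pool maps anticodon sequence -> chemical entity name.
    A fragment is captured only if its anticodon is the exact reverse
    complement of a codon on an immobilised identifier (idealised).
    """
    cognate = {reverse_complement(c) for c in identifier_codons}
    captured = {a: e for a, e in fragment_pool.items() if a in cognate}
    washed = {a: e for a, e in fragment_pool.items() if a not in cognate}
    return captured, washed

pool = {"TACGT": "entity_A", "ATGCC": "entity_B", "AAAAA": "entity_C"}
captured, washed = enrich(pool, ["ACGTA", "GGCAT"])
```

Here the fragments carrying entity_A and entity_B are retained on the support and
entity_C stays in the mobile phase; denaturation then releases the captured pool.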

The enriched pool of nucleic acid fragments associated with a chemical entity may
be used directly to prepare a next generation library of complexes, in which each
member of the library comprises an encoded molecule and the nucleic acid sequence
which codes for this molecule. In one embodiment of the invention, building blocks
comprising a particular transferable chemical entity associated with an anticodon
corresponding to the anticodons of the detected fragments are used in the generation
of the next generation library. In another embodiment, additional building blocks are
added having modified transferable chemical entities in order to improve on a certain
property of the encoded molecule.

The complexes may be prepared by various known methods starting from the
nucleic acid fragment comprising the anticodon and the chemical entity, as
disclosed above. According to a particular method, the next generation library is
formed by a) mixing under hybridisation conditions nascent bifunctional complexes
comprising a chemical entity or a reaction product of chemical entities, and an
identifier nucleic acid sequence comprising codon(s) identifying said chemical
entities, with the recovered nucleic acid fragments, said fragments comprising an
oligonucleotide sufficiently complementary to at least a part of the identifier nucleic
acid sequence to allow for hybridisation, a transferable chemical entity and an
anticodon identifying the chemical entity, to form hybridisation products; and b)
transferring the chemical entities of the nucleic acid fragments to the nascent
bifunctional complexes through a reaction involving a reactive group of the nascent
bifunctional complex, in conjunction with a transfer of the genetic information of the
anticodon.

Preferably, the above method for preparing the next generation library comprises
the further step of c) separating the components of the hybridisation product and
recovering the complexes. If further chemical entities are intended to participate in
the formation of the encoded molecule of the nascent complex, steps a) through c)
are repeated as appropriate using the complexes recovered in step c) as the nascent
bifunctional complexes in step a) of the next round.

The genetic information of the anticodon may be transferred to the nascent complex
by a variety of methods. According to a first embodiment, the genetic information of
the anticodon is transferred by enzymatically extending the oligonucleotide identifier
region to obtain a codon attached to the bifunctional complex having received the
chemical entity. A second embodiment implies that genetic information of the
anticodon is transferred to the nascent complexes by hybridisation to a cognate codon
of the nascent complex.

According to the first embodiment, the enriched pool of fragments comprises an
affinity oligonucleotide sufficiently complementary to an identifier region of the
nascent complex, said oligonucleotide being distinct from the anticodon. Accordingly,
the oligonucleotide identifier region of the nascent complex anneals to the affinity
oligonucleotide of the building block to form the hybridisation product, while the
anticodon remains single stranded. Subsequently, the chemical entity is transferred to
the recipient reactive group of the complex to form the encoded molecule prior to,
simultaneously with, or subsequent to the enzymatic extension of the hybridisation
product using the anticodon as identifier. Specific examples of suitable enzymes are
polymerases and ligases, which require dNTPs and oligonucleotides, respectively, as
substrates. The method for forming the complexes according to this first embodiment
is the subject of PCT/DK03/00739, the content thereof being incorporated herein by
reference.

According to the second embodiment, the anticodon forms part of the affinity
oligonucleotide, i.e. the anticodon is a part of or the entire affinity oligonucleotide.
Initially, a plurality of identifiers comprising different codons and/or different order
of codons is provided. The identifiers are associated with a recipient reactive group,
i.e. the reactive group may be covalently attached to the identifier or attached by
hybridisation. Notably, a codon of the identifier may be used for the attachment of a
building block harbouring the reactive group. The identifiers are subsequently
contacted with the enriched pool of building blocks, i.e. nucleic acid fragments
associated with a transferable chemical entity. The mixture of identifiers and building
blocks is maintained at hybridisation conditions to anneal the anticodon of the
building blocks to the cognate codon of the identifier. After or simultaneously with
the annealing step, the chemical entity is transferred to the recipient reactive group of
the identifier. The method for forming the complexes according to the second
embodiment is the subject of various patent applications, including WO 02/103008,
WO 02/074929, Danish patent application No. PA 2002 01347, and US provisional
patent application No. 60/409,968. The content of these patent applications is
incorporated herein by reference in their entirety.

The new generation of library complexes may be used in a partition step, in which
the library of complexes is subjected to a condition partitioning complexes displaying
a predetermined property from the remainder of the next generation library, as
explained above. Thus, using the present method, it is possible to repeat the
partitioning procedure a desired number of times using still more stringent conditions,
until a single or a few encoded molecules are identified which display the desired
property to a high extent. When the partitioning is based on an affinity assay, the
library of encoded molecules is increasingly narrowed in size from one generation to
the next and at the same time the high affinity binders are increased in concentration.

The outcome of a codon analysis will be dependent on the enrichment factor in the
selection process. An efficient and specific selection will generate a large difference
between the specific binders and the background. Still, there will be a large amount
of molecules in the background that will reduce the possibility to obtain measurable
differences between the binders and the background in the codon analysis procedure.
If the enrichment factor is not good enough (or the library is too large) to distinguish
a specific binder among the background binders, the signal in the codon analysis will
probably not be detectable. However, there will be a continuum of binders that use a
certain chemical entity in a certain position. These "not optimal" binders (a certain
important chemical entity in one position and less important entities in the other
positions) will be many due to the diversity obtained when only one (or a few)
positions are important in the selection process. Therefore, the sum of all molecules
with a preferable chemical entity in a certain position will be larger than the sum of
all molecules with a non-binding chemical entity, which will make the codon analysis
easier.

This invention may involve an extensive analysis of all the chemical entities in a
library and how they are involved in the binding to targets. This information can be
used both to design new libraries and in the final process where the lead structures
are produced and pre-clinical candidates are picked. The extensive data obtained in
the codon analysis can for instance be used for selecting candidates with the
appropriate specificity. This can be done if selection has been performed on a family
of proteins where one of the members is the target.
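
The argument that summed, per-position codon signals remain measurable even when
individual binders are not can be illustrated numerically. The sketch below assumes a
library with the same number of entities in every position, a single preferred entity
in one position, and an equal enrichment factor for every molecule carrying it; the
function names and the example numbers are hypothetical.

```python
from fractions import Fraction

def codon_fraction_after_selection(n_entities, enrichment):
    """Fraction of selected identifiers carrying the preferred codon in one
    position, if every molecule with that entity is enriched equally.
    Both arguments are integers."""
    # Preferred codon has relative weight `enrichment`; each other codon has 1.
    return Fraction(enrichment, enrichment + n_entities - 1)

def single_molecule_fraction(n_entities, n_positions, enrichment):
    """Fraction of the selected pool made up by one particular molecule that
    carries the preferred entity in exactly one position."""
    combos_elsewhere = n_entities ** (n_positions - 1)
    total_weight = (enrichment + n_entities - 1) * combos_elsewhere
    return Fraction(enrichment, total_weight)

per_codon = codon_fraction_after_selection(100, 10)
per_molecule = single_molecule_fraction(100, 3, 10)
```

With 100 entities in each of 3 positions and a ten-fold enrichment, the preferred
codon accounts for 10/109 (about 9%) of its position, against 1% before selection,
whereas any single molecule is only about 1 in 109,000 of the selected pool.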

The invention enables pharmacophore identification and transformation into small
molecule drugs. In cases where peptide-like libraries are used, the
peptide/peptidomimetic lead to small molecule conversion process is supported by
medicinal chemistry and cheminformatics and guided by matching the pharmacophore
derived from massive structure activity relationship (SAR) data from the codon
analysis. A "pharmacophore" is a description of the structural criteria a molecule must
fulfil in order that it is active against a specified biological receptor. These criteria
are usually the 3D spatial relationships of a set of chemical features, and sometimes
include the steric boundaries within which the molecule must fit. There is a set of
software methods which automatically infer such pharmacophores, given a SAR, in
the absence of direct macromolecular structural data.

The extensive SAR information obtained using the codon analyses described in this
invention can be combined with molecular modeling technologies to refine, for
example, pharmacophore models and the plausible interactions between the potential
binders and a target.

The codon analysis is also a valuable experimental tool for SAR on weak binders.
The codon analysis measures the abundance of chemical entities after a selection in
all binding molecules. Thus, even weak binders, of which there might be many, are
detected even though the detected codon is selected in many different combinations.
The selection procedure can also be tuned to enrich predominantly for weak binders,
which will simplify the codon analysis data.

This invention is also suitable for replacing the laborious task of extracting SAR
information by hand with an automated process using suitable algorithms and
software programs. The codon analysis (e.g. array or QPCR measurements) can be
directly fed into a data handling software program that uses both the codon
abundances and structural data to generate SAR information and potential
pharmacophore models.

The SAR information and potential pharmacophore models obtained from the codon
analysis can be used to design focused libraries in an array format allowing massive
and parallel testing. Thus, the selection procedure and codon analysis can be seen
as a diversity reduction step to allow a complete test of potential binders in an array
format.

Various methods for identifying the codons of the identifiers of step iii) are disclosed
herein. When a pool of partitioned identifier nucleic acid sequences is subjected to
the identification step, it is normally not practical to decode a sufficient number of
sequences comprising the entire "genome" of an encoded molecule to ensure that
all interesting encoded molecules have been revealed. Therefore, a modified
sequencing technique preferably identifies the codons in each position occurring with
the highest frequency. The next generation library is then built using in each position
the chemical entities occurring with the highest frequency.
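
Choosing the most frequent codons in each position can be sketched as a simple
tally. This is an illustration only: it assumes the partitioned identifiers are
already available as tuples of codon labels, one per position, and the function name,
the `keep` parameter and the labels are hypothetical.

```python
from collections import Counter

def top_codons_per_position(identifiers, keep=2):
    """identifiers: list of tuples holding one codon label per position.
    Return, for each position, the `keep` most frequent codons."""
    n_positions = len(identifiers[0])
    result = []
    for pos in range(n_positions):
        counts = Counter(ident[pos] for ident in identifiers)
        result.append([codon for codon, _ in counts.most_common(keep)])
    return result

selected = [
    ("A1", "B1"), ("A1", "B2"), ("A1", "B2"),
    ("A2", "B2"), ("A3", "B1"), ("A2", "B3"),
]
next_generation_codons = top_codons_per_position(selected, keep=2)
```

The chemical entities coded for by the retained codons would then be used, position
by position, to build the next generation library.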

In a certain embodiment of the invention, the codon identification step uses the
entire population of identifier nucleic acid sequences in the analysis and informs the
experimenter of the relative abundance of each codon in a certain position. The
codon information may be obtained using microarray, QPCR, or any equivalent
method for revealing the identity of codons. In contrast, sequencing a subset of
identifier nucleic acid sequences only provides the experimenter with a limited
insight as to the population of codons and the corresponding encoded molecules.

Detailed Description of the Invention

Complex

The complex comprises an encoded molecule and an identifier oligonucleotide. The
identifier comprises codons that identify the encoded molecule. Preferably, the
identifier oligonucleotide identifies the encoded molecule uniquely, i.e. in a library of
complexes a particular identifier is capable of distinguishing the molecule it is
attached to from the rest of the molecules.

The encoded molecule and the identifier may be attached directly to each other or
through a bridging moiety. In one aspect of the invention, the bridging moiety is a
selectively cleavable linkage.

The identifier oligonucleotide may comprise two or more codons. In a preferred
aspect the identifier oligonucleotide comprises three or more codons. The sequence of
each codon can be decoded utilizing the present method to identify reactants used
in the formation of the encoded molecule. When the identifier comprises more than
one codon, each member of a pool of chemical entities can be identified and the
order of codons is informative of the synthesis step each member has been
incorporated in.

In a certain embodiment, the same codon is used to code for several different
chemical entities. In a subsequent identification step, the structure of the encoded
molecule can be deduced taking advantage of the knowledge of different attachment
chemistries, steric hindrance, deprotection of orthogonal protection groups, etc. In
another embodiment, the same codon is used for a group of chemical entities having
a common property, such as a lipophilic nature, a certain attachment chemistry etc.
In a preferred embodiment, however, the codon is unique, i.e. a similar combination
of nucleotides does not appear on the identifier oligonucleotide coding for another
chemical entity. In a practical approach, for a specific chemical entity, only a single
combination of nucleotides is used. In some aspects of the invention, it may be
advantageous to use several codons for the same chemical entity, much in the same
way as Nature uses up to six different codons for a single amino acid. The two or
more codons identifying the same chemical entity may carry further information
related to different reaction conditions.

The sequence of the nucleotides in each codon may have any suitable length. The
codon may be a single nucleotide or a plurality of nucleotides. In some aspects of
the invention, it is preferred that each codon independently comprises four or more
nucleotides, more preferred 4 to 30 nucleotides. In some aspects of the invention
the lengths of the codons vary.

A certain codon may be distinguished from any other codon in the library by only a
single nucleotide. However, to facilitate a subsequent decoding process and to
increase the ability of the primer to discriminate between codons, it is in general
desired to have two or more mismatches between a particular codon and any other
codon appearing on the identifier oligonucleotide. As an example, if a codon length
of 5 nucleotides is selected, more than 100 nucleotide combinations exist in which
two or more mismatches appear. For a certain number of nucleotides in the codon, it
is generally desired to optimize the number of mismatches between a particular codon
relative to any other codon appearing in the library.
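
The figure quoted for 5-nucleotide codons can be checked constructively. One simple
construction, which is an assumption of this sketch and not a method described in the
present application, takes every 4-nucleotide word and appends a check nucleotide
equal to the sum of the first four (with A, C, G, T counted as 0 to 3) modulo 4. Two
words differing in one data position then also differ in the check position.

```python
from itertools import combinations, product

BASES = "ACGT"

def checksum_codons(data_length=4):
    """Build codons of length data_length + 1 in which any two codons
    differ in at least two positions, by appending a check nucleotide."""
    codons = []
    for data in product(range(4), repeat=data_length):
        check = sum(data) % 4
        codons.append("".join(BASES[i] for i in (*data, check)))
    return codons

def mismatches(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))

codons = checksum_codons()
min_distance = min(mismatches(a, b) for a, b in combinations(codons, 2))
```

This yields 256 codons of 5 nucleotides with at least two mismatches between any
pair, which is consistent with the statement that more than 100 such combinations
exist.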

The identifier oligonucleotide will in general have at least two codons arranged in
sequence, i.e. next to each other. Two neighbouring codons may be separated by a
framing sequence. Depending on the encoded molecule formed, the identifier may
comprise further codons, such as 3, 4, 5, or more codons. Each of the further
codons may be separated by a suitable framing sequence. Preferably, all or at least
a majority of the codons of the identifier are separated from a neighbouring codon
by a framing sequence. The framing sequence may have any suitable number of
nucleotides, e.g. 1 to 20. Alternatively, codons on the identifier may be designed
with overlapping sequences.

The framing sequence, if present, may serve various purposes. In one setup of the
invention, the framing sequence identifies the position of the codon. Usually, the
framing sequence either upstream or downstream of a codon comprises information
which positions the chemical entity and the reaction conditions in the synthesis
history of the encoded molecule. The framing sequence may also or in addition
provide for a region of high affinity. The high affinity region may ensure that a
hybridisation event with an anti-codon will occur in frame. Moreover, the framing
sequence may adjust the annealing temperature to a desired level.

A framing sequence with high affinity can be provided by incorporation of one or
more nucleobases forming three hydrogen bonds to a cognate nucleobase. Examples
of nucleobases having this property are guanine and cytosine. Alternatively, or in
addition, the framing sequence may be subjected to backbone modification. Several
backbone modifications provide for higher affinity, such as 2'-O-methyl substitution
of the ribose moiety, peptide nucleic acids (PNA), and 2'-4' O-methylene cyclisation
of the ribose moiety, also referred to as LNA (Locked Nucleic Acid).

The sequence comprising a codon and an adjacent framing sequence has in a certain
aspect of the invention a total length of 11 nucleotides or more, preferably 15
nucleotides or more. A primer may be designed to be complementary to the codon
sequence as well as the framing sequence. The presence of an extension reaction
under conditions allowing for such reaction to occur is indicative of the presence of
the chemical entity encoded in the codon as well as the position said chemical entity
has in the entire synthesis history of the encoded molecule.
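
How a panel of primers spanning framing sequence plus codon reports both the entity
and its position can be sketched as follows. Exact substring matching stands in for
primer annealing and extension, which is a strong simplification; the sequences, the
table of entities and the function name are hypothetical.

```python
def primer_panel(identifier, framing_by_position, codon_table):
    """Return {position: entity} for every framing+codon site found.

    framing_by_position lists one framing sequence per synthesis position;
    codon_table maps codon sequence -> chemical entity name. A site counts
    as detected when framing + codon occurs on the identifier (idealised
    stand-in for a successful primer extension).
    """
    found = {}
    for position, framing in enumerate(framing_by_position, start=1):
        for codon, entity in codon_table.items():
            if framing + codon in identifier:
                found[position] = entity
    return found

framing = ["GGCC", "CCGG"]                      # GC-rich, one per position
table = {"ATATA": "entity_A", "TATAT": "entity_B"}
identifier = "GGCC" + "ATATA" + "CCGG" + "TATAT"
history = primer_panel(identifier, framing, table)
```

In this example the same codon table is probed at both positions, and the framing
sequence alone determines which synthesis step each detected entity is assigned to.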

The identifier may comprise flanking regions around the coding section. The flanking
regions can also serve as priming sites for amplification reactions, such as PCR, or
as binding regions for oligonucleotide probes. The identifier may in certain
embodiments comprise an affinity region having the property of being able to
hybridise to a building block.



It is to be understood that when the term identifier oligonucleotide is used in the
present description and claims, the identifier oligonucleotide may be in the sense or
the anti-sense format, i.e. the identifier can be a sequence of codons which actually
codes for the encoded molecule or can be a sequence complementary thereto.
Moreover, the identifier may be single-stranded or double-stranded, as appropriate.



The encoded molecule part of the complex is generally of a structure expected to
have an effect according to the property sought, e.g. the encoded molecule has
a binding affinity towards a target. When the target is of pharmaceutical importance,
the encoded molecule is generally a possible drug candidate. The complex may be
formed by tagging a library of different possible drug candidates with a tag, e.g. a
nucleic acid tag identifying each possible drug candidate. In another embodiment of
the invention, the molecule is formed by a variety of reactants which have reacted
with each other and/or a scaffold molecule. Optionally, this reaction product may be
post-modified to obtain the final molecule displayed on the complex. The
post-modification may involve the cleavage of one or more chemical bonds attaching
the encoded molecule to the identifier in order to display the encoded molecule more
efficiently.



The formation of an encoded molecule generally starts by a scaffold, i.e. a chemical
unit having one or more reactive groups capable of forming a connection to another
reactive group positioned on a chemical entity, thereby generating an addition to the
original scaffold. A second chemical entity may react with a reactive group also
appearing on the original scaffold or a reactive group incorporated by the first
chemical entity. Further chemical entities may be involved in the formation of the
final reaction product. The formation of a connection between the chemical entity
and the nascent encoded molecule may be mediated by a bridging molecule. As an
example, if the nascent encoded molecule and the chemical entity both comprise an
amine group, a connection between these can be mediated by a dicarboxylic acid. A
synthetic molecule is in general produced in vitro and may be a naturally occurring
or an artificial substance. Usually, a synthetic molecule is not produced using the
natural translation system in an in vitro process.

The chemical entities that are precursors for structural additions or eliminations of
the encoded molecule may be attached to a building block prior to the participation
in the formation of the reaction product leading to the final encoded molecule. Besides
the chemical entity, the building block generally comprises an anti-codon. In some
embodiments the building blocks also comprise an affinity region providing for
affinity towards the nascent complex.

Thus, the chemical entities are suitably mediated to the nascent encoded molecule
by a building block, which further comprises an anti-codon. The anti-codon serves
the function of transferring the genetic information of the building block in
conjunction with the transfer of a chemical entity. The transfer of genetic information
and chemical entity may occur in any order. The chemical entities are preferably
reacted without enzymatic interaction in some aspects of the invention. Notably, the
reaction of the chemical entities is preferably not mediated by ribosomes or
enzymes having similar activity. In other aspects of the invention, enzymes are used
to mediate the reaction between a chemical entity and a nascent encoded molecule.

According to certain aspects of the invention the genetic information of the
anti-codon is transferred by specific hybridisation to a codon on a nucleic acid
identifier. Another method for transferring the genetic information of the anti-codon
to the nascent complex is to anneal an oligonucleotide complementary to the anti-codon
and attach this oligonucleotide to the complex, e.g. by ligation. A still further method
involves transferring the genetic information of the anti-codon to the nascent
complex by an extension reaction using a polymerase and a mixture of dNTPs.

The chemical entity of the building block may in most cases be regarded as a
precursor for the structural entity eventually incorporated into the encoded molecule.
In other cases the chemical entity provides for the elimination of chemical units of
the nascent encoded molecule. Therefore, when it is stated in the present application
and claims that a chemical entity is transferred to a nascent encoded molecule, it is
to be understood that not necessarily all the atoms of the original chemical entity are
to be found in the eventually formed encoded molecule. Also, as a consequence of the
reactions involved in the connection, the structure of the chemical entity can be
changed when it appears on the nascent encoded molecule. In particular, the cleavage
resulting in the release of the entity may generate a reactive group which in a
subsequent step can participate in the formation of a connection between a nascent
complex and a chemical entity.

The chemical entity of the building block comprises at least one reactive group
capable of participating in a reaction which results in a connection between the
chemical entity of the building block and another chemical entity or a scaffold
associated with the nascent complex. The number of reactive groups which appear on
the chemical entity is suitably one to ten. A building block featuring only one reactive
group is used i.a. in the end positions of polymers or scaffolds, whereas building
blocks having two reactive groups are suitable for the formation of the body part of a
polymer or scaffolds capable of being reacted further. One, two or more reactive
groups intended for the formation of connections are typically present on scaffolds.
Non-limiting examples of scaffolds are opiates, steroids, benzodiazepines,
hydantoins, and peptidylphosphonates.

The reactive group of the chemical entity may be capable of forming a direct
connection to a reactive group of the nascent complex or the reactive group of the
building block may be capable of forming a connection to a reactive group of the nascent
complex through a bridging fill-in group. It is to be understood that not all the atoms
of a reactive group are necessarily maintained in the connection formed. Rather, the
reactive groups are to be regarded as precursors for the structure of the connection.

The subsequent cleavage step to release the chemical entity from the building block
can be performed in any appropriate way. In an aspect of the invention the cleavage
involves usage of a chemical reagent or an enzyme. The cleavage results in a transfer
of the chemical entity to the nascent encoded molecule or in a transfer of the
nascent encoded molecule to the chemical entity of the building block. In some
cases it may be advantageous to introduce new chemical groups as a consequence
of linker cleavage. The new chemical groups may be used for further reaction in a
subsequent cycle, either directly or after having been activated. In other cases it is
desirable that no trace of the linker remains after the cleavage.

In another aspect the connection and the cleavage are conducted as a simultaneous
reaction, i.e. either the chemical entity of the building block or the nascent encoded
molecule is a leaving group of the reaction. In some aspects of the invention, it is
appropriate to design the system such that the connection and the cleavage occur
simultaneously because this will reduce the number of steps and the complexity.
The simultaneous connection and cleavage can also be designed such that either
no trace of the linker remains or such that a new chemical group for further reaction
is introduced, as described above.

The attachment of the chemical entity to the building block, optionally via a suitable
spacer, can be at any entity available for attachment, e.g. the chemical entity can be
attached to a nucleobase or the backbone. In general, it is preferred to attach the
chemical entity at the phosphorus of the internucleoside linkage or at the nucleobase.
When the nucleobase is used for attachment of the chemical entity, the attachment
point is usually at the 7 position of the purines or 7-deaza-purines or at the 5 position
of pyrimidines. The nucleotide may be distanced from the reactive group of the
chemical entity by a spacer moiety. The spacer may be designed such that the
conformational space sampled by the reactive group is optimized for a reaction with
the reactive group of the nascent encoded molecule.

The encoded molecules may have any chemical structure. In a preferred aspect, the
encoded molecule can be any compound that may be synthesized in a
component-by-component fashion. In some aspects the synthetic molecule is a linear or
branched polymer. In another aspect the synthetic molecule is a scaffolded
molecule. The term "encoded molecule" also comprises naturally occurring
molecules like α-polypeptides etc., however produced in vitro, usually in the absence
of enzymes, like ribosomes. In certain aspects, the synthetic molecule of the library
is a non-α-polypeptide.

The encoded molecule may have any molecular weight. However, in order to be
orally available, it is in this case preferred that the synthetic molecule has a
molecular weight less than 2000 Daltons, preferably less than 1000 Daltons, and
more preferred less than 500 Daltons.

The size of the library may vary considerably depending on the expected result of the
inventive method. In some aspects, it may be sufficient that the library comprises
two, three, or four different complexes. However, in most events, more than two
different complexes are desired to obtain a higher diversity. In some aspects, the
library comprises 1,000 or more different complexes, more preferred 1,000,000 or
more different complexes. The upper limit for the size of the library is only restricted
by the size of the vessel in which the library is comprised. It may be calculated that a
vial may comprise up to 10^^ different complexes. 
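
As a hedged illustration of how quickly the number of different complexes grows, the following sketch assumes (this is an assumption made for the example, not a statement of the present description) that every combination of chemical entities across the codon positions is represented, so that the library size is the product of the number of alternatives at each position. The function name and the numbers are hypothetical.

```python
from math import prod
from typing import Sequence


def library_size(entities_per_position: Sequence[int]) -> int:
    """Number of distinct complexes if all combinations of entities are formed."""
    if not entities_per_position or any(n < 1 for n in entities_per_position):
        raise ValueError("each position needs at least one chemical entity")
    return prod(entities_per_position)


# Three codon positions with 10 or 100 alternative chemical entities each.
print(library_size([10, 10, 10]))     # 1000
print(library_size([100, 100, 100]))  # 1000000
```

Under this assumption, three positions with 10 alternatives each already give the 1,000 complexes mentioned above, and 100 alternatives per position give 1,000,000.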

Methods for forming libraries of complexes

The encoded molecules associated with an identifier oligonucleotide having two or
more codons that code for reactants that have reacted in the formation of the
molecule part of the complex may be formed by a variety of processes. Generally, the
preferred methods can be used for the formation of virtually any kind of encoded
molecule. Suitable examples of processes include prior art methods disclosed in
WO 93/20242, WO 93/06121, WO 00/23458, WO 02/074929, and WO 02/103008,
the content of which is incorporated herein by reference, as well as methods of
the present applicant not yet publicly available, including the methods disclosed in
PCT/DK03/00739 filed 30 October 2003, and DK PA 2003 00430 filed 20 March
2003. Any of these methods may be used, and the entire content of the patent
applications is included herein by reference.

Below, five presently preferred embodiments are described. A first embodiment,
disclosed in more detail in WO 02/103008, is based on the use of a polymerase to
incorporate unnatural nucleotides as building blocks. Initially, a plurality of identifier
oligonucleotides is provided. Subsequently, primers are annealed to each of the
identifiers and a polymerase extends the primer using nucleotide derivatives
which have appended chemical entities. Subsequent to or simultaneously with the
incorporation of the nucleotide derivatives, the chemical entities are reacted to form
a reaction product. The encoded molecule may be post-modified by cleaving some
of the linking moieties to better present the encoded molecule.

Several possible reaction approaches for the chemical entities are apparent. First,
the nucleotide derivatives can be incorporated and the chemical entities
subsequently polymerised. In the event the chemical entities each carry two reactive
groups, the chemical entities can be attached to adjacent chemical entities by a
reaction of these reactive groups. Exemplary of the reactive groups are amine and
carboxylic acid, which upon reaction form an amide bond. Adjacent chemical entities
can also be linked together using a linking or bridging moiety. Exemplary of this
approach is the linking of two chemical entities each bearing an amine group by a
bi-carboxylic acid. Yet another approach is the use of a reactive group between a
chemical entity and the nucleotide building block, such as an ester or a thioester
group. An adjacent building block having a reactive group such as an amine may
cleave the interspaced reactive group to obtain a linkage to the chemical entity, e.g.
by an amide linking group.

A second embodiment for obtainment of complexes, disclosed in WO 02/103008,
pertains to the use of hybridisation of building blocks to an identifier and reaction of
chemical entities attached to the building blocks in order to obtain a reaction
product. In this approach, identifiers are contacted with a plurality of
building blocks, wherein each building block comprises an anti-codon and a
chemical entity. The anti-codons are designed such that they recognise a sequence,
i.e. a codon, on the identifier. Subsequent to the annealing of the anti-codon and the
codon to each other, a reaction of the chemical entity is effected.

The identifier may be associated with a scaffold. Building blocks bringing in chemical
entities may be added sequentially or simultaneously and a reaction of the
reactive group of the chemical entity may be effected at any time after the annealing
of the building blocks to the identifier.
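
The recognition step of this second embodiment can be sketched as a lookup in which each codon of an identifier selects the building block whose anti-codon is its reverse complement. The sketch assumes perfect Watson-Crick pairing on plain DNA; the codon sequences and the chemical entity names are hypothetical placeholders.

```python
from typing import Dict, List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(identifier_codons: List[str], building_blocks: Dict[str, str]) -> List[str]:
    """Return, in codon order, the chemical entity of each annealing building block.

    building_blocks maps an anti-codon (5'->3') to the name of its chemical entity.
    """
    entities = []
    for codon in identifier_codons:
        anticodon = reverse_complement(codon)
        if anticodon not in building_blocks:
            raise KeyError(f"no building block recognises codon {codon}")
        entities.append(building_blocks[anticodon])
    return entities


# Hypothetical building blocks: anti-codon -> chemical entity.
blocks = {"AACGT": "amine A", "GATCC": "acid B", "TTTTT": "unused entity"}

print(anneal(["ACGTT", "GGATC"], blocks))  # ['amine A', 'acid B']
```

The order of the returned entities follows the order of the codons on the identifier, which is the information the identifier is meant to preserve.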

A third embodiment for the generation of a complex includes chemical or enzymatic
ligation of building blocks when these are lined up on an identifier. Initially, identifiers
are provided, each having one or more codons. The identifiers are contacted with
building blocks comprising anti-codons linked to chemical entities. The two or more
anti-codons annealed on an identifier are subsequently ligated to each other and a
reaction of the chemical entities is effected to obtain a reaction product. The method
is disclosed in more detail in DK PA 2003 00430 filed 20 March 2003.

A fourth embodiment makes use of the extension by a polymerase of an affinity
sequence of the nascent complex to transfer the anti-codon of a building block to the
nascent complex. The method implies that a nascent complex comprising a scaffold
and an affinity region is annealed to a building block comprising a region
complementary to the affinity section. Subsequently, the anti-codon region of the
building block is transferred to the nascent complex by a polymerase. The chemical
entity may be transferred prior to, simultaneously with or subsequent
to the transfer of the anti-codon. This method is disclosed in detail in
PCT/DK03/00739.
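
The polymerase-mediated transfer of this fourth embodiment can be sketched as a string operation. The sketch assumes that the affinity region sits at the 3' end of the nascent complex and that the building block oligonucleotide, written 5'->3', consists of the anti-codon followed by the region complementary to the affinity region, so that extension of the nascent 3' end copies the anti-codon. The layout, names and sequences are illustrative assumptions, not details taken from PCT/DK03/00739.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def extend(nascent: str, building_block: str, affinity_len: int) -> str:
    """Extend the nascent strand (5'->3') using the building block as template."""
    if not 0 < affinity_len <= min(len(nascent), len(building_block)):
        raise ValueError("affinity_len must fit within both strands")
    affinity = nascent[-affinity_len:]
    if not building_block.endswith(reverse_complement(affinity)):
        raise ValueError("building block does not anneal to the affinity region")
    anticodon = building_block[: len(building_block) - affinity_len]
    # The polymerase appends the complement of the anti-codon to the 3' end.
    return nascent + reverse_complement(anticodon)


nascent = "GG" + "CAGTCGA"              # hypothetical flank + affinity region
building_block = "AACGT" + "TCGACTG"    # anti-codon + complement of affinity region

print(extend(nascent, building_block, affinity_len=7))  # GGCAGTCGAACGTT
```

After the extension, the nascent strand ends with the sequence complementary to the anti-codon, so the identity of the building block is recorded on the nascent complex.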

A fifth embodiment, also disclosed in PCT/DK03/00739, comprises reaction of a
reactant with a reaction site on a nascent bifunctional molecule and addition of a
nucleic acid tag to the nascent bifunctional molecule using an enzyme, such as a
ligase. When a library is formed, usually an array of compartments is used for
reaction of reactants and enzymatic addition of tags with the nascent bifunctional
molecule.
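
The use of an array of compartments can be sketched as follows. Each compartment holds one reactant and the tag that encodes it, and every nascent bifunctional molecule entering a compartment receives both. The sketch assumes that each species present in the pool is distributed to every compartment and that the contents are pooled again after each round; the pooling step, the reactant names and the tag sequences are assumptions made for illustration.

```python
from typing import List, Tuple

Member = Tuple[Tuple[str, ...], str]  # (reacted units, identifier sequence)


def react_and_tag(pool: List[Member], compartments: List[Tuple[str, str]]) -> List[Member]:
    """One round: in each compartment add its reactant and its tag to every member."""
    new_pool = []
    for reactant, tag in compartments:
        for molecule, identifier in pool:
            new_pool.append((molecule + (reactant,), identifier + tag))
    return new_pool


# Hypothetical two-round library starting from a single scaffold.
pool = [(("scaffold",), "")]
round1 = [("R1a", "AAC"), ("R1b", "GGT")]
round2 = [("R2a", "CTA"), ("R2b", "TCG")]

library = react_and_tag(react_and_tag(pool, round1), round2)
for molecule, identifier in library:
    print(identifier, molecule)
```

In this sketch each identifier is the concatenation of the tags in the order of addition, so reading the identifier recovers the reactants and the order in which they were used.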

Thus, the codons are either pre-made into one or more identifiers before the
encoded molecules are generated or the codons are transferred simultaneously with
the formation of the encoded molecules.

After or simultaneously with the formation of the reaction product, some of the linkers
to the identifier may be cleaved; however, usually at least one linker is maintained to
provide for the complex.

Nucleotides

The nucleotides used in the present invention may be linked together in a sequence
of nucleotides, i.e. an oligonucleotide. Each nucleotide monomer is normally
composed of two parts, namely a nucleobase moiety and a backbone. The backbone
may in some cases be subdivided into a sugar moiety and an internucleoside linker.

The nucleobase moiety may be selected among naturally occurring nucleobases as
well as non-naturally occurring nucleobases. Thus, "nucleobase" includes not only
the known purine and pyrimidine heterocycles, but also heterocyclic analogues and
tautomers thereof. Illustrative examples of nucleobases are adenine, guanine,
thymine, cytosine, uracil, purine, xanthine, diaminopurine, 8-oxo-N⁶-methyladenine,
7-deazaxanthine, 7-deazaguanine, N⁴,N⁴-ethanocytosine, N⁶,N⁶-ethano-2,6-diaminopurine,
5-methylcytosine, 5-(C₃-C₆)-alkynylcytosine, 5-fluorouracil, 5-bromouracil,
pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine,
inosine and the "non-naturally occurring" nucleobases described in Benner et al.,
U.S. Pat No. 5,432,272. The term "nucleobase" is intended to cover these examples
as well as analogues and tautomers thereof. Especially interesting nucleobases are
adenine, guanine, thymine, cytosine, 5-methylcytosine, and uracil, which are
considered as the naturally occurring nucleobases in relation to therapeutic and
diagnostic application in humans.

Examples of suitable specific pairs of nucleobases are shown below: 

[Figure: Natural base pairs (adenine with uracil, R = H, or thymine, R = CH₃; guanine
with cytosine). Synthetic base pairs. Synthetic purine bases pairing with natural
pyrimidines (7-deaza adenine with uracil or thymine; 7-deaza guanine with cytosine).]


Suitable examples of backbone units are shown below (B denotes a nucleobase): 

[Figure: structures of backbone units.]

The sugar moiety of the backbone is suitably a pentose but may be the appropriate
part of a PNA or a six-member ring. Suitable examples of possible pentoses include
ribose, 2'-deoxyribose, 2'-O-methyl-ribose, 2'-fluoro-ribose, and 2'-4'-O-methylene-ribose
(LNA). Suitably the nucleobase is attached to the 1' position of the pentose
entity.

An internucleoside linker connects the 3' end of a preceding monomer to the 5' end of a
succeeding monomer when the sugar moiety of the backbone is a pentose, like
ribose or 2'-deoxyribose. The internucleoside linkage may be the naturally occurring
phosphodiester linkage or a derivative thereof. Examples of such derivatives include
phosphorothioate, methylphosphonate, phosphoramidate, phosphotriester, and
phosphodithioate. Furthermore, the internucleoside linker can be any of a number of
non-phosphorous-containing linkers known in the art.

Preferred nucleic acid monomers include naturally occurring nucleosides forming
part of the DNA as well as the RNA family connected through phosphodiester
linkages. The members of the DNA family include deoxyadenosine, deoxyguanosine,
deoxythymidine, and deoxycytidine. The members of the RNA family include
adenosine, guanosine, uridine, cytidine, and inosine. Inosine is a non-specific pairing
nucleoside and may be used as a universal base because inosine can pair nearly
isoenergetically with A, T, and C. Other compounds having the same ability of
non-specifically base-pairing with natural nucleobases have been formed. Suitable
compounds which may be utilized in the present invention include among others the
compounds depicted below

Building block

The chemical entities or reactants that are precursors for structural additions or
eliminations of the encoded molecule may be attached to a building block prior to
the participation in the formation of the reaction product leading to the final encoded
molecule. Besides the chemical entity, the building block generally comprises an
anti-codon.

The chemical entity of the building block comprises at least one reactive group
capable of participating in a reaction which results in a connection between the
chemical entity of the building block and another chemical entity or a scaffold
associated with the nascent complex. The connection is facilitated by one or more
reactive groups of the chemical entity. The number of reactive groups which appear on
the chemical entity is suitably one to ten. A building block featuring only one reactive
group is used i.a. in the end positions of polymers or scaffolds, whereas building
blocks having two reactive groups are suitable for the formation of the body part of a
polymer or scaffolds capable of being reacted further. One, two or more reactive
groups intended for the formation of connections are typically present on scaffolds.

The reactive group of the building block may be capable of forming a direct
connection to a reactive group of the nascent complex or the reactive group of the building
block may be capable of forming a connection to a reactive group of the nascent
complex through a bridging fill-in group. It is to be understood that not all the atoms
of a reactive group are necessarily maintained in the connection formed. Rather, the
reactive groups are to be regarded as precursors for the structure of the connection.

The subsequent cleavage step to release the chemical entity from the building block
can be performed in any appropriate way. In an aspect of the invention the cleavage
involves usage of a reagent or an enzyme. The cleavage results in a transfer of the
chemical entity to the nascent encoded molecule or in a transfer of the nascent
encoded molecule to the chemical entity of the building block. In some cases it may be
advantageous to introduce new chemical groups as a consequence of linker
cleavage. The new chemical groups may be used for further reaction in a subsequent
cycle, either directly or after having been activated. In other cases it is desirable that
no trace of the linker remains after the cleavage.

In another aspect the connection and the cleavage are conducted as a simultaneous
reaction, i.e. either the chemical entity of the building block or the nascent
encoded molecule is a leaving group of the reaction. In general, it is preferred to
design the system such that the connection and the cleavage occur simultaneously
because this will reduce the number of steps and the complexity. The simultaneous
connection and cleavage can also be designed such that either no trace of the linker
remains or such that a new chemical group for further reaction is introduced, as
described above.

The attachment of the chemical entity to the building block, optionally via a suitable
spacer, can be at any entity available for attachment, e.g. the chemical entity can be
attached to a nucleobase or the backbone. In general, it is preferred to attach the
chemical entity at the phosphorus of the internucleoside linkage or at the nucleobase.
When the nucleobase is used for attachment of the chemical entity, the attachment
point is usually at the 7 position of the purines or 7-deaza-purines or at the 5 position
of pyrimidines. The nucleotide may be distanced from the reactive group of the
chemical entity by a spacer moiety. The spacer may be designed such that the
conformational space sampled by the reactive group is optimized for a reaction with the
reactive group of the nascent encoded molecule or reactive site.

The anti-codon complements the codon of the identifier oligonucleotide sequence
and generally comprises the same number of nucleotides as the codon. The
anti-codon may be adjoined with a fixed sequence, such as a sequence complementing
a framing sequence.

Various specific building blocks are envisaged. Building blocks of particular interest
are shown below.

Building blocks transferring a chemical entity to a recipient nucleophilic group
The building block indicated below is capable of transferring a chemical entity (CE)
to a recipient nucleophilic group, typically an amine group. The bold lower horizontal
line illustrates the building block comprising an anti-codon and the vertical line
illustrates a spacer.

The 5-membered substituted N-hydroxysuccinimide (NHS) ring serves as an activator,
i.e. a labile bond is formed between the oxygen atom connected to the NHS ring
and the chemical entity. The labile bond may be cleaved by a nucleophilic group,
e.g. positioned on a scaffold, to transfer the chemical entity to the scaffold, thus
converting the remainder of the fragment into a leaving group of the reaction.
When the chemical entity is connected to the activator through a carbonyl group and
the recipient group is an amine, the bond formed on the scaffold will be an amide bond.
The above building block is the subject of WO03078627A2, the content of which is
incorporated herein in its entirety by reference.

Another building block, which may form an amide bond, is




R may be absent or NO₂, CF₃, halogen, preferably Cl, Br, or I, and Z may be S or O.
This type of building block is disclosed in WO03078626A2. The content of this
patent application is incorporated herein in its entirety by reference.

A nucleophilic group can cleave the linkage between Z and the carbonyl group,
thereby transferring the chemical entity -(C=O)-CE' to said nucleophilic group.

Building blocks transferring a chemical entity to a recipient reactive group
forming a C=C bond

A building block as shown below is able to transfer the chemical entity to a recipient
aldehyde group, thereby forming a double bond between the carbon of the aldehyde
and the chemical entity

The above building block is disclosed in WO03078445A2, the content of which is
incorporated herein in its entirety by reference.

Building blocks transferring a chemical entity to a recipient reactive group
forming a C-C bond

The below building block is able to transfer the chemical entity to a recipient group,
thereby forming a single bond between the receiving moiety, e.g. a scaffold, and the
chemical entity.

The above building block is disclosed in WO03078445A2, the content of which is
incorporated herein in its entirety by reference.

Another building block capable of transferring a chemical entity to a receiving
reactive group forming a single bond is



The receiving group may be a nucleophile, such as a group comprising a hetero
atom, thereby forming a single bond between the chemical entity and the hetero
atom, or the receiving group may be an electronegative carbon atom, thereby
forming a C-C bond between the chemical entity and the scaffold. The above building
block is disclosed in WO03078446A2, the content of which is incorporated herein by
reference.

The chemical entity attached to any of the above building blocks may be selected
from a large arsenal of chemical structures. Examples of chemical entities are
H or entities selected among the group consisting of a C₁-C₆ alkyl, C₂-C₆ alkenyl,
C₂-C₆ alkynyl, C₄-C₈ alkadienyl, C₃-C₇ cycloalkyl, C₃-C₇ cycloheteroalkyl, aryl, and
heteroaryl, said group being substituted with 0-3 R⁴, 0-3 R⁵ and 0-3 R⁹ or
C₁-C₃ alkylene-NR⁴₂, C₁-C₃ alkylene-NR⁴C(O)R⁸, C₁-C₃ alkylene-NR⁴C(O)OR⁸,
C₁-C₂ alkylene-O-NR⁴₂, C₁-C₂ alkylene-O-NR⁴C(O)R⁸, C₁-C₂ alkylene-O-NR⁴C(O)OR⁸
substituted with 0-3 R⁹,

where R⁴ is H or selected independently among the group consisting of
C₁-C₆ alkyl, C₂-C₆ alkenyl, C₂-C₆ alkynyl, C₃-C₇ cycloalkyl, C₃-C₇ cycloheteroalkyl,
aryl, heteroaryl, said group being substituted with 0-3 R⁹ and

R⁵ is selected independently from -N₃, -CNO, -C(NOH)NH₂, -NHOH,
-NHNHR⁶, -C(O)R⁶, -SnR⁶₃, -B(OR⁶)₂, -P(O)(OR⁶)₂ or the group consisting of C₂-C₆
alkenyl, C₂-C₆ alkynyl, C₄-C₈ alkadienyl, said group being substituted with 0-2 R⁷,

where R⁶ is selected independently from H, C₁-C₆ alkyl, C₃-C₇ cycloalkyl,
aryl or C₁-C₆ alkylene-aryl substituted with 0-5 halogen atoms selected from -F,
-Cl, -Br, and -I; and

R⁷ is independently selected from -NO₂, -COOR⁶, -COR⁶, -CN,
-OSiR⁶₃, -OR⁶ and -NR⁶₂.

R⁸ is H, C₁-C₆ alkyl, C₂-C₆ alkenyl, C₂-C₆ alkynyl, C₃-C₇ cycloalkyl, aryl
or C₁-C₆ alkylene-aryl substituted with 0-3 substituents independently selected from
-F, -Cl, -NO₂, -R³, -OR³, -SiR³₃

R⁹ is =O, -F, -Cl, -Br, -I, -CN, -NO₂, -OR⁶, -NR⁶₂, -NR⁶-C(O)R⁸,
-NR⁶-C(O)OR⁸, -SR⁶, -S(O)R⁶, -S(O)₂R⁶, -COOR⁶, -C(O)NR⁶₂ and -S(O)₂NR⁶₂.



Cross-link cleavage building blocks

It may be advantageous to split the transfer of a chemical entity to a recipient
reactive group into two separate steps, namely a cross-linking step and a cleavage step,
because each step can be optimized. A suitable building block for this two-step
process is illustrated below:



Initially, a reactive group appearing on the chemical entity precursor (abbreviated
FEP) reacts with a recipient reactive group, e.g. a reactive group appearing on a
scaffold, thereby forming a cross-link. Subsequently, a cleavage is performed,
usually by adding an aqueous oxidising agent such as I₂, Br₂, Cl₂, H⁺, or a Lewis acid.
The cleavage results in a transfer of the group HZ-FEP- to the recipient moiety, such
as a scaffold.

In the above formula

Z is O, S, NR⁴

Q is N, CR¹

P is a valence bond, O, S, NR⁴, or a group C₅₋₇ arylene, C₁₋₆ alkylene,
C₁₋₆ O-alkylene, C₁₋₆ S-alkylene, NR¹-alkylene, C₁₋₆ alkylene-O, C₁₋₆ alkylene-S,
said group optionally being substituted with 0-3 R⁴, 0-3 R⁵ and 0-3 R⁹ or
C₁-C₃ alkylene-NR⁴₂, C₁-C₃ alkylene-NR⁴C(O)R⁸, C₁-C₃ alkylene-NR⁴C(O)OR⁸,
C₁-C₂ alkylene-O-NR⁴₂, C₁-C₂ alkylene-O-NR⁴C(O)R⁸, C₁-C₂ alkylene-O-NR⁴C(O)OR⁸
substituted with 0-3 R⁹,

B is a group comprising D-E-F, in which

D is a valence bond or a group C₁₋₆ alkylene, C₁₋₆ alkenylene,
C₁₋₆ alkynylene, C₅₋₇ arylene, or C₅₋₇ heteroarylene, said group optionally being
substituted with 1 to 4 group R¹¹,

E is, when present, a valence bond, O, S, NR⁴, or a group C₁₋₆ alkylene,
C₁₋₆ alkenylene, C₁₋₆ alkynylene, C₅₋₇ arylene, or C₅₋₇ heteroarylene, said
group optionally being substituted with 1 to 4 group R¹¹,

F is, when present, a valence bond, O, S, or NR⁴,

A is a spacing group distancing the chemical structure from the
complementing element, which may be a nucleic acid,

R¹, R², and R³ are independent of each other selected among the
group consisting of H, C₁-C₆ alkyl, C₂-C₆ alkenyl, C₂-C₆ alkynyl, C₄-C₈ alkadienyl,
C₃-C₇ cycloalkyl, C₃-C₇ cycloheteroalkyl, aryl, and heteroaryl, said group being
substituted with 0-3 R⁴, 0-3 R⁵ and 0-3 R⁹ or C₁-C₃ alkylene-NR⁴₂,
C₁-C₃ alkylene-NR⁴C(O)R⁸, C₁-C₃ alkylene-NR⁴C(O)OR⁸, C₁-C₂ alkylene-O-NR⁴₂,
C₁-C₂ alkylene-O-NR⁴C(O)R⁸, C₁-C₂ alkylene-O-NR⁴C(O)OR⁸ substituted with 0-3 R⁹,

FEP is a group selected among the group consisting of H, C₁-C₆ alkyl,
C₂-C₆ alkenyl, C₂-C₆ alkynyl, C₄-C₈ alkadienyl, C₃-C₇ cycloalkyl, C₃-C₇ cycloheteroalkyl,
aryl, and heteroaryl, said group being substituted with 0-3 R⁴, 0-3 R⁵ and 0-3 R⁹




or C1-C3 alkylene-NR^4_2, C1-C3 alkylene-NR^4C(O)R^8, C1-C3 alkylene-NR^4C(O)OR^8,
C1-C2 alkylene-O-NR^4_2, C1-C2 alkylene-O-NR^4C(O)R^8, C1-C2
alkylene-O-NR^4C(O)OR^8 substituted with 0-3 R^9,

where R^4 is H or selected independently among the group consisting of
C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 cycloalkyl, C3-C7 cycloheteroalkyl,
aryl, heteroaryl, said group being substituted with 0-3 R^9 and

R^5 is selected independently from -N3, -CNO, -C(NOH)NH2, -NHOH,
-NHNHR^6, -C(O)R^6, -SnR^6_3, -B(OR^6)2, -P(O)(OR^6)2 or the group consisting of
C2-C6 alkenyl, C2-C6 alkynyl, C4-C8 alkadienyl, said group being substituted with
0-2 R^7,

where R^6 is selected independently from H, C1-C6 alkyl, C3-C7 cycloalkyl,
aryl or C1-C6 alkylene-aryl substituted with 0-5 halogen atoms selected from -F,
-Cl, -Br, and -I; and R^7 is independently selected from -NO2, -COOR^6, -COR^6, -CN,
-OSiR^6_3, -OR^6 and -NR^6_2.

R^8 is H, C1-C6 alkyl, C2-C6 alkenyl, C2-C6 alkynyl, C3-C7 cycloalkyl, aryl or C1-C6
alkylene-aryl substituted with 0-3 substituents independently selected from -F, -Cl,
-NO2, -R^3, -OR^3, -SiR^3_3

R^9 is =O, -F, -Cl, -Br, -I, -CN, -NO2, -OR^6, -NR^6_2, -NR^6-C(O)R^8,
-NR^6-C(O)OR^8, -SR^6, -S(O)R^6, -S(O)2R^6, -COOR^6, -C(O)NR^6_2 and -S(O)2NR^6_2.

In a preferred embodiment Z is O or S, P is a valence bond, Q is CH, B is CH2, and
R^1, R^2, and R^3 are H. The bond between the carbonyl group and Z is cleavable with
aqueous I2.

Partitioning conditions 

The partition step may be referred to as a selection or a screen, as appropriate, and
includes the screening of the library for encoded molecules having predetermined
desirable characteristics. Predetermined desirable characteristics can include
binding to a target, catalytically changing the target, chemically reacting with a
target in a manner which alters/modifies the target or the functional activity of the
target, and covalently attaching to the target as in a suicide inhibitor.

The target can be any compound of interest. For example, the target can be a protein,
peptide, carbohydrate, polysaccharide, glycoprotein, hormone, receptor, antigen,
antibody, virus, substrate, metabolite, transition state analogue, cofactor,
inhibitor, drug, dye, nutrient, growth factor, cell, tissue, etc. without limitation.
Particularly preferred targets include, but are not limited to, angiotensin [...],
cyclooxygenase, 5-lipoxygenase, IL-1β converting enzyme, cytokine receptors,
PDGF receptor, type II inosine monophosphate dehydrogenase, β-lactamases,
integrin, and fungal cytochrome P-450. Targets can include, but are not limited to,
bradykinin, neutrophil elastase, the HIV proteins, including tat, rev, gag, int, RT,
nucleocapsid etc., VEGF, bFGF, TGFβ, KGF, PDGF, thrombin, theophylline, caffeine,
substance P, IgE, sPLA2, red blood cells, glioblastomas, fibrin clots, PBMCs, hCG,
lectins, selectins, cytokines, ICP4, complement proteins, etc.

Encoded molecules having predetermined desirable characteristics can be partitioned
away from the rest of the library while still attached to the identifier nucleic
acid sequence by various methods known to one of ordinary skill in the art. In one
embodiment of the invention the desirable products are partitioned away from the
entire library without chemical degradation of the attached nucleic acid identifier,
such that the identifiers are amplifiable. The identifiers may then be amplified,
either still attached to the desirable encoded molecule or after separation from the
desirable encoded molecule.

In a preferred embodiment, the desirable encoded molecule acts on the target without
any interaction between the nucleic acid attached to the desirable encoded
molecule and the target. In one embodiment, the bound complex-target aggregate
can be partitioned from unbound complexes by a number of methods. The methods
include nitrocellulose filter binding, column chromatography, filtration, affinity
chromatography, centrifugation, and other well known methods.

Briefly, the library of complexes is subjected to the partitioning step, which may
include contact between the library and a column onto which the target is immobilised.
Identifier nucleic acids associated with undesirable encoded molecules, i.e. encoded
molecules not bound to the target under the stringency conditions used, will pass
through the column. Additional undesirable encoded molecules (e.g. encoded
molecules which cross-react with other targets) may be removed by counter-selection
methods. Desirable complexes are bound to the column and can be eluted by
changing the conditions of the column (e.g., salt, pH, surfactant, etc.) or the
identifier.

Additionally, encoded molecules which react with a target can be separated from
those products that do not react with the target. In one example, a chemical
compound which covalently attaches to the target (such as a suicide inhibitor) can be
washed under very stringent conditions. The resulting complex can then be treated
with proteinase, DNAse or other suitable reagents to cleave a linker and liberate the
nucleic acids which are associated with the desirable chemical compound. The
liberated nucleic acids can be amplified.

In another example, the predetermined characteristic of the desirable product is the
ability of the product to transfer a chemical group (such as acyl transfer) to the target
and thereby inactivate the target. One could have a product library where all of the
products have a thioester chemical group. Upon contact with the target, the desirable
products will transfer the chemical group to the target, concomitantly changing
the desirable product from a thioester to a thiol. Therefore, a partitioning method
which would identify products that are now thiols (rather than thioesters) will enable
the selection of the desirable products and amplification of the nucleic acid
associated therewith.

There are other partitioning and screening processes compatible with this
invention that are known to one of ordinary skill in the art. In one embodiment, the
products can be fractionated by a number of common methods and each fraction
is then assayed for activity. The fractionation methods can be based on size, pH,
hydrophobicity, etc.

Inherent in the present method is the selection of encoded molecules on the basis of
a desired function; this can be extended to the selection of molecules with a desired
function and specificity. Specificity can be required during the selection process by
first extracting identifier nucleic acid sequences of chemical compounds which are
capable of interacting with a non-desired "target" (negative selection, or
counter-selection), followed by positive selection with the desired target. As an
example, inhibitors of fungal cytochrome P-450 are known to cross-react to some extent
with mammalian cytochrome P-450 (resulting in serious side effects). Highly specific
inhibitors of the fungal cytochrome could be selected from a library by first removing
those products capable of interacting with the mammalian cytochrome, followed by
retention of the remaining products which are capable of interacting with the fungal
cytochrome.



Brief Description of the Figures

Fig. 1 illustrates the overall process of building block evolution.
Fig. 2 shows the distribution of codons in different positions in an output from a
selection.
Fig. 3 shows the difference between identifier driven and building block driven
evolution.
Fig. 4 shows a method for reducing the library diversity through codon analysis.
Fig. 5 discloses two embodiments of using a Taqman probe (5' nuclease probe) in
the measurement of the presence or absence of a certain codon.
Fig. 6 shows a standard curve referred to in example 4.
Fig. 7 shows a result of example 4.
Fig. 8 discloses a result of example 4.
Fig. 9 discloses a scheme relating to combined structural information and codon
abundances in library design.
Fig. 10 discloses a relationship between codon analysis and structural information.
Fig. 11 shows the detection of single codons of identifiers.
Fig. 12 shows the detection of codon pairs of identifiers.
Fig. 13 shows the detection of codon pairs at specific codon positions.
Fig. 14 shows the detection of single codons of identifiers after the separation of the
individual codons.
Fig. 15 discloses a method for selecting, from a library, complexes capable of binding
to a target molecule.
Fig. 16 discloses a method for enriching specific nucleic acid fragments and the
utility of these fragments for the generation of a new library.
Fig. 17 discloses a method for reducing the diversity of a library of complexes.

Detailed description of the figures 

Fig. 1A shows the principal steps in BB evolution. An initial library of desired size is
produced. This initial library is subjected to a selection process where encoded
molecules that associate with a target of interest are enriched. The encoding
identifier oligonucleotide is preferably amplified and then used in the codon analysis
step. This step monitors the relative abundance of each codon in the selected library.
The information obtained in this analysis is used to design a new enriched library,
which contains the preferable chemical entities and their corresponding codons. This
new library is then subjected to a new selection process to select for binders. This
diversity reduction cycle can be repeated until the desirable result is obtained and
the binders have been identified.

Fig. 1B shows how the diversity of a library (n^x) is reduced by reducing the number
of chemical entities (n) in the library. Thus, by removing chemical entities not
involved in the encoded molecules partitioned, a reduction in library diversity can be
obtained to allow the identification of binders.

The identifier oligonucleotide that encodes the display molecule is composed of
codons and is associated with the encoded molecule, as shown in Fig. 2. These
codons carry information about the chemical entities in the encoded molecule.
Each of these codon positions can be analysed for the precise sequence, which will
reflect which chemical entities have been enriched in the selection process.
The relative amount can also be obtained by comparing the signal in the measuring
procedure (e.g. QPCR and array analysis). Each codon position will have its own
fingerprint showing which chemical entities the selected display molecules possess.
These fingerprints in each position can subsequently be used to put together a new,
more focused library with a lower and more enriched diversity that can be subjected to
another round of selection. This can then be repeated until the preferable encoded
molecules have been obtained.
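
The per-position fingerprint described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the procedure of
the application: it assumes the selected identifiers have already been decoded into
tuples of codon labels (in practice the abundances would come from QPCR or array
signals), and the function names and the `min_fraction` cut-off are hypothetical.

```python
from collections import Counter

def codon_fingerprints(identifiers):
    """Return, for each codon position, a dict mapping codon -> relative abundance."""
    if not identifiers:
        return []
    n_positions = len(identifiers[0])
    fingerprints = []
    for pos in range(n_positions):
        counts = Counter(ident[pos] for ident in identifiers)
        total = sum(counts.values())
        fingerprints.append({codon: n / total for codon, n in counts.items()})
    return fingerprints

def enriched_codons(fingerprints, min_fraction=0.2):
    """Keep, per position, the codons whose relative abundance reaches the cut-off."""
    return [sorted(c for c, f in fp.items() if f >= min_fraction) for fp in fingerprints]

# Hypothetical output of a selection: four identifiers with four codon positions each.
selected = [
    ("A1", "B2", "C3", "D1"),
    ("A1", "B5", "C3", "D2"),
    ("A4", "B2", "C3", "D1"),
    ("A1", "B2", "C3", "D7"),
]
fps = codon_fingerprints(selected)
focused = enriched_codons(fps, min_fraction=0.5)
print(focused)
```

The per-position lists in `focused` would then define which chemical entities are
carried into the next, more focused library.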

Fig. 3 illustrates the main difference between identifier and chemical entity (CE)
evolution. In both cases the initial selection starts on a library with a certain
diversity. After the first round of selection the encoding identifiers are amplified,
whereby the distribution is maintained. In identifier-driven evolution this
distribution is then transferred to the next generation, which is used in a new
selection. Thus, the strongest binders that were enriched in the first
round of selection will be present at a relatively higher concentration compared to
the weaker binders and the background. In the CE-driven evolution the codon
analysis is used to design a new library. In this example, the new library is
constructed to contain all the chemical entities that were identified as a positive
signal in the analysis. In other words, all the chemical entities that were not
detected through the codon analysis are excluded from the new library. The new library
is designed to have an equal amount of each selected chemical entity, which will
generate all the possible display molecules at the same concentration. This will allow
all binders to compete at the same concentration and potentially retain a more diverse
set of binders in each round of selection. This is especially important for small
molecules, where not only the affinity is of interest.
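
As a rough sketch of the CE-driven design step, the following Python fragment
enumerates every combination of the chemical entities detected at each position and
assigns each resulting display molecule the same share of the new library. The
function name and the example entity labels are hypothetical, and a real library is
of course synthesised rather than enumerated in memory.

```python
from itertools import product

def design_ce_driven_library(detected_per_position):
    """Enumerate all combinations of detected chemical entities at equal concentration."""
    members = list(product(*detected_per_position))
    share = 1.0 / len(members)
    return {member: share for member in members}

# Hypothetical codon analysis result: entities with a positive signal per position.
detected = [["a", "b"], ["c"], ["d", "e", "f"]]
library = design_ce_driven_library(detected)
print(len(library))
```

Every member, whether it stems from a strong or a weak binder family in the previous
round, starts the next selection at the same relative concentration.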

Fig. 4 illustrates the process in which the diversity is reduced through the
codon analysis. An initial library of 10^10 (e.g. 317*317*317*317) library members is
subjected to a selection. The enriched identifier oligonucleotides are amplified and
used in the codon analysis. The codon analysis result is used to design a new 10^7
(e.g. 57*57*57*57) library in which the enriched chemical entities are included. This
new library is then again subjected to a selection process. The identifier
oligonucleotides are amplified and used for codon analysis. This new codon analysis
result is again used to design a new 10^4 (e.g. 10*10*10*10) library in which the
enriched chemical entities are included. Finally a last selection step is performed on
this reduced diversity library to identify the binders.
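
The library sizes quoted for the three rounds follow directly from the number of
chemical entities per position. A short Python check, with hypothetical names and
assuming four positions with the same number of entities in each, is:

```python
import math

def library_size(entities_per_position):
    """Number of distinct encoded molecules for the given entity count per position."""
    return math.prod(entities_per_position)

rounds = {"initial": [317] * 4, "second": [57] * 4, "third": [10] * 4}
for name, entities in rounds.items():
    size = library_size(entities)
    print(f"{name}: {size:,} members (about 10^{round(math.log10(size))})")
```

This confirms that 317, 57 and 10 entities per position correspond to libraries of
roughly 10^10, 10^7 and 10^4 members.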

A preferred embodiment of the invention utilizing a universal Taqman probe is
shown in Fig. 5. Four codons are shown (P1 through P4; bold pattern) along with
flanking regions (light pattern). A universal Taqman probe anneals to a region
adjacent to the codon region, but within the amplicon defined by the universal PCR
primers Pr. 1 and Pr. 2. These primers could be the same as those used for
amplification of the identifier oligonucleotides encoding binders after an enrichment
process on a specific target. However, if minimal-length identifiers are preferred
during the encoding process, the region involved in Taqman probe annealing could be
appended to the library identifier oligonucleotides by e.g. overlap PCR, ligation, or
by employing a long downstream PCR primer containing the necessary sequences. The added
length corresponding to the region necessary for annealing of the Taqman probe
would be from 20 to 40 nts depending on the type of Taqman probe and Ta of the
PCR primers. The Q-PCR reactions are preferably performed in a 96- or 384-well
format on a real-time PCR thermocycling machine.

Panel A shows the detection of the abundance of a specific codon sequence in position
one. Similar primers are prepared for all codon sequences. For each codon sequence
utilized to encode a specific BB in the library a Q-PCR reaction is performed
with a primer oligonucleotide complementary to the codon sequence in question. A
downstream universal reverse primer Pr. 2 is provided after the Taqman probe to
provide for an exponential amplification of the PCR amplicon. The setup is most
suited for cases where the codon has a length corresponding to a length suitable
for a PCR primer.

Panel B shows the detection of the abundance of a specific codon sequence in a
specific codon position using a primer which complements a codon and a framing
sequence. Similar primers are used for all the codons and framing sequences. For
each codon sequence utilized to encode a specific BB at a specific codon position in
the library a Q-PCR reaction is performed with an oligo complementary to the codon
sequence in question as well as a short region up- or downstream of the codon
region, which ensures extension of the primer in a PCR reaction only when annealed
to the codon sequence in that specific codon position. The number of specific
primers and Q-PCR reactions needed to cover all codon sequences in all possible codon
positions equals the number of codon sequences times the number of codon
positions. Thus, monitoring the abundance of 96 different codon sequences in 4
different positions can be performed in a single run on four 96-well microtitre plates
(as shown in Panel B) or a single 384-well plate on a suitable instrument. This
architecture allows for the decoding of a library of approximately 8.5 × 10^7 (96^4)
different encoded molecules.
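
The bookkeeping in this paragraph (reactions = codon sequences × positions, and the
size of the library that can be decoded) can be reproduced with a small Python
helper. The function name is hypothetical and the sketch assumes one Q-PCR reaction
per well.

```python
import math

def qpcr_plan(n_codons, n_positions, wells_per_plate):
    """Return (reactions needed, plates needed, decodable library size)."""
    reactions = n_codons * n_positions
    plates = math.ceil(reactions / wells_per_plate)
    decodable = n_codons ** n_positions
    return reactions, plates, decodable

print(qpcr_plan(96, 4, 96))   # four 96-well plates
print(qpcr_plan(96, 4, 384))  # one 384-well plate
```

With 96 codon sequences in 4 positions, 384 reactions suffice to profile a library
of 96^4 = 84,934,656 encoded molecules.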

Quantification is performed relative to the amount of full-length PCR product
obtained in a parallel control reaction on the same input material performed with the
two external PCR primers Pr. 1 + Pr. 2. Theoretically, a similar rate of accumulation
of this control amplicon compared to the accumulation of a product utilizing a single
codon sequence-specific primer would indicate a 100% dominance of this particular
sequence in the position in question.
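
The application does not give a formula for this relative quantification. One common
way to express it, shown here purely as an assumption, is the comparative threshold
cycle approach: if each PCR cycle ideally doubles the product, the fraction of
identifiers carrying a given codon is about 2 raised to minus the difference between
the threshold cycle of the codon-specific reaction and that of the control reaction.
The function name, the `efficiency` parameter and the example cycle numbers are
hypothetical.

```python
def codon_fraction(ct_codon, ct_control, efficiency=2.0):
    """Estimate the fraction of identifiers carrying a codon from threshold cycles.

    Assumes ideal amplification (product multiplied by `efficiency` each cycle).
    Equal threshold cycles correspond to 100% dominance; the value is capped at 1.
    """
    delta = ct_codon - ct_control
    return min(1.0, efficiency ** (-delta))

print(codon_fraction(20.0, 20.0))  # same accumulation rate as the control
print(codon_fraction(22.0, 20.0))  # two cycles later than the control
```

Under this idealised model, a codon-specific reaction that lags the control by two
cycles would correspond to about a quarter of the identifiers in that position.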

Although the setups shown in Panels A and B employ a Taqman probe strategy,
other detection systems (SYBR green, Molecular Beacons etc.) could be utilized. In
theory, multiplex reactions employing up to 4 different fluorophores in the same
reaction could increase throughput correspondingly.

An example of how a deconvolution process of a library of encoded molecules
occurs is described in the following. Imagine that at the end of a selection scheme a
pool of 3 ligand families (and the corresponding coding identifiers) is dominating
the population, the families being present at approx. the same concentration. Three
different chemical entities are present in the first position of the encoded compounds,
and each of these chemical entities is present in combination with one unique chemical
entity out of 3 different chemical entities in position P2. Only one chemical entity in
position 3 gives rise to active binders, whereas any of a 20% subset of chemical
entities (e.g. determined by charge, size or other characteristics) is present in
position 4. The outcome of the initial codon profile analysis would be: 3 codon
sequences are equally dominating in position P1, 3 other codon sequences in position
P2, 1 unique codon sequence is dominant in P3, whereas somewhat similarly
increased levels of 20% of the codon sequences (background levels of the remaining
80% of the sequences) are seen in P4. In such cases it could be relevant to use an
iterative Q-PCR ("IQ-PCR") strategy to perform a further deconvolution of a library
after selection. Again with reference to the example above, by taking the PCR products
from the 3 individual wells that contained primers giving the high yields in position
P1, diluting the product appropriately and performing a second round of Q-PCR on
each of these identifier oligonucleotides separately, it would be possible to deduce
which codon sequence(s) is preferred in P2 when a given codon sequence is
present in P1.
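
The logic of the iterative step, namely profiling P2 separately for each codon that
dominated in P1, amounts to a conditional count. The Python sketch below mimics this
in silico on an invented pool of decoded identifiers; in the laboratory the grouping
is done physically by re-amplifying the product of each P1-specific well. The
function name and the codon labels are hypothetical.

```python
from collections import Counter, defaultdict

def conditional_profile(identifiers, given_pos, target_pos):
    """Count codons at target_pos separately for each codon seen at given_pos."""
    profile = defaultdict(Counter)
    for ident in identifiers:
        profile[ident[given_pos]][ident[target_pos]] += 1
    return {codon: dict(counts) for codon, counts in profile.items()}

# Invented pool: three ligand families, each pairing one P1 codon with one P2 codon.
pool = [
    ("A", "X", "M", "q"),
    ("A", "X", "M", "r"),
    ("B", "Y", "M", "q"),
    ("C", "Z", "M", "s"),
]
pairs = conditional_profile(pool, given_pos=0, target_pos=1)
print(pairs)
```

A single-round profile would only show that A, B, C dominate P1 and X, Y, Z dominate
P2; the conditional profile reveals which P2 codon goes with which P1 codon.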

Fig. 9 illustrates the possibility of combining structural information about
the chemical entities and the relative abundance when designing a new, more
focused library. The structural information about the chemical entities can be used in
at least two ways. First, the similarities between the chemical entities in each
position can be used to choose chemical entities for a new library. Secondly, the
combination of the selected chemical entities can be analyzed to investigate possible
patterns that generate potential ligands. This is especially useful if the binding site
or the structure of a known ligand is known. Any type of structural analysis tool can
be used that generates information about the structure of separate chemical entities or
combinations of chemical entities (the potential binders). By combining these three
analysis approaches a more focused library can be generated that potentially will
contain more specific binders compared to background binders. This new focused library
can be used in another round of selection to reduce the diversity. This procedure can
be repeated until the desired binders have been identified.

Fig. 10 shows how the combination of codon analysis and structural
information can generate valuable information. This invention allows the performance
of structure activity relationship analysis (SAR) where the relative abundance
in the codon analysis will represent the activity parameter (e.g. IC50 values) in the
SAR measurements. Pharmacophore models can be generated, focused libraries
can be designed, certain follow-up chemistry can be used and information in the
hit-to-lead process can be used.

Fig. 11 shows an array detection system in which a single codon is detected. Initially
a library of selected complexes (29), i.e. complexes comprised in the initial library
which display a certain property, is provided as disclosed above. The initial library of
complexes is prepared from e.g. 100 codons and identifiers having 4 codons in
sequence, which theoretically gives a library of 10^8 complexes. The selected complexes
are subjected to amplification to amplify the identifiers of the selected complexes
and the amplification products are added to an array (30). The array (30) comprises
probes (32) complementary to each of the codons of the identifiers (31). At
hybridisation conditions the PCR products of the identifiers are annealed to the
cognate probes of the array and in a suitable scanner the spatial positions of the
annealed probes are detected to elucidate the codons (33) of the identifier. The
quantity of each codon may be measured to find codons abundant in more than one
identifier and/or codons leading to encoded molecules with high affinity. The
information may be used for decoding of the encoded molecule of the complexes
displaying the desired property or the information may be used for selection of
building blocks, which are to be added in a next round of library formation.

Fig. 12 discloses an array detection system for establishing codon pairs, i.e.
codons in the vicinity of each other. Initially (as shown in this example) a library of
complexes is prepared from 100 different codons deposited on an identifier in a
sequence of four, making the total number of possible combinations 10^8. The initial
library is subjected to a condition in order to select a sub-library (29) displaying a
desired property. The identifiers of the sub-library are amplified by a PCR reaction
and the reaction product is added under hybridisation conditions to an array (34).
The array is designed with probes (35) capable of detecting two codons.
To cover all possible combinations of a library based on 100 different codons, 10^4
probes are needed, which is practically feasible with the current technology.

The detection of the codons may be conducted quantitatively, i.e. the relative
abundance of each of the codon pairs may be determined. The detection on the array
may be used to reconstruct the selected identifiers (36), as three overlapping codon
pair detections depict the entire identifier. In the event the same codon pair appears
on more than one identifier, the information on the relative abundance of each
codon pair may be used to decipher the sequence of codons of the selected
identifiers, as it can be assumed that each codon pair of the same identifier appears
in the same amounts in the PCR products added to the array.
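
How three overlapping codon pairs depict a four-codon identifier, and why abundance
information is needed when pairs are shared, can be illustrated with a small Python
sketch. It chains detected pairs (first codon, second codon) into all identifiers
consistent with them; the function name and codon labels are hypothetical, and the
abundance-based disambiguation described above is not implemented here.

```python
def reconstruct_identifiers(detected_pairs, length=4):
    """Chain overlapping codon pairs into all codon sequences of the given length."""
    successors = {}
    for first, second in detected_pairs:
        successors.setdefault(first, set()).add(second)
    paths = [tuple(pair) for pair in set(detected_pairs)]
    for _ in range(length - 2):
        # Extend each partial identifier by every codon detected as a neighbour
        # of its last codon.
        paths = [p + (nxt,) for p in paths for nxt in successors.get(p[-1], ())]
    return sorted(paths)

# One selected identifier: its three pairs determine it uniquely.
unique = reconstruct_identifiers([("A", "B"), ("B", "C"), ("C", "D")])
print(unique)

# Two selected identifiers sharing the pair B-C: four candidates are consistent.
ambiguous = reconstruct_identifiers(
    [("A", "B"), ("B", "C"), ("C", "D"), ("E", "B"), ("C", "F")]
)
print(ambiguous)
```

In the second case the pair detections alone cannot tell A-B-C-D and E-B-C-F from
the two mixed candidates, which is where the relative abundance of each pair comes in.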

Fig. 13 discloses an array for detecting codon pairs at specific codon positions.
Initially, a library of complexes comprising identifiers with framing sequences is
provided. The framing sequence is specific for each position of the codons on the
identifier. Four times more probes on the microarray are needed for each codon if the
position of the codons should also be detected in the analysis, which is practically
feasible with current technology. The position is detected due to the framing
sequences next to each codon. The initial library is subjected to a selection process
to isolate complexes (37) having a desired property. The selected complexes are
amplified by a PCR reaction and the reaction products are added to an array (38). The
array comprises probes capable of detecting codon pairs as well as the framing
sequences (40) between the codons. The framing sequence determines the position
of the codon in the reaction history, i.e. it is possible to deduce which chemical
entity reacted at which point in time of the synthesis history of the encoded
molecule, thus making it possible to reconstruct the structure of the encoded molecule.

The detection of the codon pairs may be conducted quantitatively, i.e. the relative
abundance of each of the codon pairs may be determined. The detection on the
array may be used to reconstruct the selected identifiers (41), as three overlapping
codon pair detections depict the entire identifier. In the event the same codon pair
appears on more than one identifier, the information on the relative abundance of
each codon pair may be used to decipher the sequence of codons of the selected
identifiers, as it can be assumed that each codon pair of the same identifier appears
in the same amounts in the PCR products added to the array.

Fig. 14 shows an array detection system in which a single codon is detected. Initially
a library of selected complexes (42), i.e. complexes comprised in the initial library
which display a certain property, is provided as disclosed above. The initial library of
complexes is prepared from e.g. 100 codons and identifiers having 4 codons in
sequence, which theoretically gives a library of 10^8 complexes. The selected complexes
are subjected to amplification to amplify the identifiers of the selected complexes
and the amplification products are treated with suitable reagents to cut between the
individual codons (43). The individual codons are then applied to the array. The array
(44) comprises probes (45) complementary to each of the codons of the identifiers
(46). At hybridisation conditions the PCR products of the identifiers are annealed to
the cognate probes of the array and in a suitable scanner the spatial positions of the
annealed probes are detected to elucidate the codons (47) of the identifier. The
quantity of each codon may be measured to find codons abundant in more than one
identifier and/or codons leading to encoded molecules with high affinity. The
information may be used for decoding of the encoded molecule of the complexes
displaying the desired property or the information may be used for selection of
building blocks, which are to be added in a next round of library formation.

Fig. 15 discloses a method for selection of a suitable complex in several steps. In a
first step the library of complexes 1 is provided. Each member of the library
comprises an encoded molecule 2 composed of four chemical entities, which is attached
to an identifier oligonucleotide 3, which comprises four codons. The initial library
shown comprises three complexes. In a second step the library of complexes is
incubated with immobilized target molecules 4. The encoded molecule having an
affinity towards the target molecule is bound to the immobilized target, whereas
encoded molecules not having affinity towards the target under the conditions used
remain in the liquid media. The complexes remaining in the liquid media are
discarded by a washing process, while the bound complexes remain attached to the
immobilized target molecules. The washing process is usually conducted using mild
stringency conditions in the initial rounds of selection. In later stage selections the
washing stringency conditions are usually increased to allow only high affinity
binders to remain attached to the target. Subsequent to the washing step the complexes
having affinity towards the target molecule are recovered. The recovery process
usually requires high stringency conditions to detach the encoded molecule from the
immobilized target. The selected sub-library resulting from the elution is
subjected to an amplification process. The amplification of the identifier nucleic acid
sequence of the selected complexes is usually performed using the PCR method.
Preferably, a modification of the PCR method is followed such that a biotin molecule
is attached to one of the primers to obtain a handle for subsequent immobilization.
The result of the amplification step is multiple copies of the identifier nucleic acid
sequences, which code for the encoded molecules which have survived the
selection step.

Fig. 16 discloses an enrichment process of building blocks. The building blocks can
be used for generation of a new library. Initially, identifier nucleic acid sequences are
immobilized on a solid support. In one aspect of the invention the identifier nucleic
acid sequences are the product of the selection procedure described in Fig. 1. Each
codon of the identifier nucleic acid sequence is identified with an uppercase letter,
i.e. A, B, C, or D. The immobilized identifier nucleic acid sequences are contacted
with the pool of building blocks under hybridisation conditions. Each of the building
blocks is illustrated with a sequence complementary to a codon which may or may not
be present on the identifier nucleic acid sequence. The complementary sequences
are indicated with an apostrophe, e.g. A', B', etc. The transferable chemical entity of
a building block is illustrated with a lowercase letter. The conditions providing for
hybridisation of the complementing sequences of the pool of building blocks to the
immobilised identifier nucleic acid sequence are preferably such that cognate
nucleic acid sequences are hybridised to each other while sequences not recognizing
any immobilized sequence remain in the aqueous media. The immobilized sequences of
the identifier nucleic acid sequences are thus used as bait in catching building
blocks with complementing sequences. Following the incubation step, non-binding
building blocks are removed by washing, whereby the part of the pool of building
blocks not being able to find a complementing sequence is discarded. The building
blocks attached to the immobilized nucleic acid sequences are detached using
dehybridisation conditions. The diminished pool of building blocks may be used in a
subsequent round for preparing a new library of complexes, in which the encoded
molecule comprises a reaction product comprising additions from chemical entities
attached to the enriched building blocks. Because the order of building blocks which
have participated in the formation of the encoded molecules successful in the
selection procedure is not preserved by the method for enriching building blocks, a
scrambling of the encoded molecules may be obtained in some of the methods
described herein for obtaining a library of complexes. In some applications of the
library it will be an advantage to have a scrambling of the building blocks because
an increased diversity is obtained.

Fig. 17 discloses a method for reducing the diversity of the library of complexes
resulting from the method described in Fig. 16. In some applications of the
library, the diversity induced by scrambling of the building blocks is not desired. In a
first step, the sequences complementary to the identifier nucleic acid sequences used in
Fig. 16 are provided and immobilized on a suitable solid support. In one aspect of the
invention, the complementary sequence is obtained from the PCR product resulting
from the method according to Fig. 15. Alternatively, the complementing sequence
may be obtained by extending the identifier nucleic acid sequence using a suitable
primer, optionally attached to a handle such as biotin or dinitrophenol. In a second
step, the immobilized complementary sequence is incubated with the scrambled
library under conditions which provide for hybridisation between the complementary
sequence and members of the library having affinity towards this sequence. Members
of the library not having affinity to the complementary sequences remain in the
media and are discarded, while members of the library able to hybridise to the
immobilized nucleic acid sequences are recovered. Occasionally, nucleic acids not
perfectly matching the complementary sequence immobilized on the solid support
are caught. In one aspect of the invention, the hybridisation products are, prior to the
recovery step, treated with an enzyme capable of recognizing mismatching
nucleotides and cleaving the double-stranded helix in which they are situated. An
example of an enzyme with this ability is T4 endonuclease VII. After the treatment with
the enzyme, complexes displaying hybridisation to the immobilized sequence
are eluted under dehybridisation conditions. Nucleotide sequences remaining from
the cleavage by the enzyme will also be present in the new library; however, these
sequences will not have any effect on a subsequent selection because no molecule
is attached thereto.

Examples

Example 1. Enrichment of nucleic acid fragments
A codon was included in the oligonucleotide sequence shown below. The codon is
underlined and the boldface sequences represent the "framing" regions next to each
codon. These framing regions can be used for specifying the position of each codon.

Biotin-AATTCCGGAACATACTAGTCAACATGA-3' (SEQ ID NO:1)

This identifier oligonucleotide was immobilized on streptavidin beads using standard
protocols, i.e. 600 pmol identifier oligonucleotide with 5'-dT biotin in 60 µl 100 mM
MES pH 6.0 was mixed with 50 µl SA-magnetic beads (Roche). The mixture was
washed 2-3 times with 100 mM MES pH 6.0 to remove non-bound identifier
oligonucleotides. To reduce background binding, the oligos and beads were incubated at RT
for 10 min on a shaker, then incubated on ice for 10 min while rotating the tube.
Finally, the sample was washed with 100 mM MES 4 times in 800 µl at 60°C.

In the case where a PCR product is immobilized, the complementing (non-sense)
strand is removed using 10 mM NaOH. This will generate single-stranded DNA with
the selected codons. The same procedure described in this example can be used
for a collection of different identifier nucleic acid molecules that contain one or more
codons. The codons in the identifier nucleic acid molecules can be the same or
different, as determined from the enrichment performed on the initial library.

The immobilized identifier nucleic acid molecule was mixed with the pool of nucleic
acid fragments shown below. This pool of fragments illustrates an original pool that
was used for generating an initial library of complexes. Each fragment may possess
in the 3'-end a specific chemical entity that is encoded by the codon sequence.
These nucleic acid fragments contain a specific sequence in the codon region
(underlined), while the framing region shown in boldface is identical among the
fragments. Thus, the pool of fragments represents different codons in the same position
of the identifier nucleic acid.



1. CGT GTG ATC GAA CTC GTG TG GTA TGATCAGTTGT ACT-5' (SEQ ID NO:2)
2. CGT GTG ATC GAA CTC GTG TG GTA TCTAGTCGGTT ACT-5' (SEQ ID NO:3)
3. CGT GTG ATC GAA CTC GTG TG GTA TTCGAGTGTTT ACT-5' (SEQ ID NO:4)
4. CGT GTG ATC GAA CTC GTG TG GTA TAGCTCATGGT ACT-5' (SEQ ID NO:5)
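
The capture principle of this example can be sketched in a few lines of Python. The sketch is illustrative only: it takes the four codon-specific regions as they read in the fragment listing (written 3' to 5'), derives the identifier codon each would pair with, and counts how many positions separate each fragment from fragment 1. The function names are hypothetical, and the count of mismatches is only a crude proxy for hybridisation behaviour.

```python
# Codon-specific regions of fragments 1-4, written 3'->5' as in the listing.
ANTICODONS = {
    1: "TGATCAGTTGT",
    2: "TCTAGTCGGTT",
    3: "TTCGAGTGTTT",
    4: "TAGCTCATGGT",
}

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement; a 3'->5' input gives the 5'->3' partner strand."""
    return "".join(PAIR[base] for base in seq)


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Identifier codon (5'->3') that fragment 1 would hybridise to.
codon_for_fragment_1 = complement(ANTICODONS[1])

# Distance of every fragment from the fragment that matches the codon.
distance_to_fragment_1 = {
    number: mismatches(seq, ANTICODONS[1]) for number, seq in ANTICODONS.items()
}

if __name__ == "__main__":
    print(codon_for_fragment_1)
    print(distance_to_fragment_1)
```

Only fragment 1 has zero mismatches against the immobilized codon, which is the property the washing step relies on.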



The nucleic acid fragments were mixed with the immobilized identifier nucleic acid
molecules, using 600 pmol of each nucleic acid fragment (100 mM MES pH 6.0,
150 mM NaCl). The mixture was incubated at 25°C for 30 minutes in a shaker. The
non-hybridized fragments were removed by washing 4 times in 800 µl 100 mM MES,
150 mM NaCl. This step should separate the complementing fragments (bound) encoding
the selected chemical entities from the non-complementing fragments (non-bound)
encoding chemical entities that were not effective in the preceding selection process. The
annealed fragments were eluted from the immobilized identifier nucleic acid molecules
by re-suspending the beads in 25 µl 60°C H2O and incubating for 2 min at 60°C. The
enriched fragments were purified on a micro-spin gel filtration column (BioRad).
The eluted fragments were prepared for mass spectroscopy (MS) analysis by mixing
in half a volume of ion exchanger resin and incubating for a minimum of 2 h at 25°C on a
shaker. After incubation the resin was removed by centrifugation and 15 µl of the
supernatant was mixed with 7 µl of water, 2 µl of piperidine and imidazole (each 625
mM) and 24 µl of acetonitrile. The sample was analysed using a mass spectroscopy
instrument (Bruker Daltonics, Esquire 3000plus). The result of the MS analysis is
shown below.

The mass of the correct complementary fragment (number 1) is obtained in the
MS analysis (11438.39 Da; expected 11439 Da). No masses for the other fragments
(numbers 2-4) could be found in the MS spectra (expected masses: 11415,
11430 and 11424 Da). This result shows that the right fragment is strongly enriched and
that other fragments with the wrong codon sequences are removed. The enrichment is
possible even when the "spacing" region (boldface) is identical in each fragment.

Two control experiments were also performed to validate the enrichment protocol. In
the first experiment, the fragment with the correct codon sequence (number 1) was
mixed with the immobilized identifier molecule as described above. The sample was
washed and eluted, also as described above, and prepared for MS analysis. The
result from the MS analysis is shown below.

The result indicates that the fragment with the correct sequence (number 1) anneals
to the immobilized identifier molecules and is eluted under the conditions used in
this example. The expected mass (11439 Da) correlates well with the experimental
mass, 11438.39 Da.



In the other control experiment, a fragment with a wrong codon sequence (number
3) was allowed to bind to the immobilized identifier molecule as described above.
Again, the eluted sample was prepared and analysed with MS. The result is shown
below.

Mass       Ion       Absolute abundance   Relative abundance
8663.17    (M-H)-    89308                96.60
6616.52    (M-H)-    41963                45.39
8678.68    (M-H)-    13651                14.77



In this experiment, no mass was found that corresponded to the expected mass
(11430 Da) of the tested building block (number 3). Again, this shows that fragments
with an anticodon sequence different from the enriched codons in the identifier
nucleic acid molecules are not captured using this approach.
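
The way the MS read-out is interpreted in this example can be expressed as a simple mass-matching step. The following Python sketch is an illustration under stated assumptions: the 2 Da tolerance and the function name are hypothetical, and the expected masses of fragments 2-4 are assigned in the order in which they are listed in the text.

```python
# Expected fragment masses in Da; order for fragments 2-4 assumed from the text.
EXPECTED_MASS_DA = {1: 11439.0, 2: 11415.0, 3: 11430.0, 4: 11424.0}


def matching_fragments(observed_da, tolerance_da=2.0):
    """Return the fragment numbers whose expected mass lies within the tolerance."""
    return [
        number
        for number, expected in EXPECTED_MASS_DA.items()
        if abs(observed_da - expected) <= tolerance_da
    ]


# Enrichment experiment: a single peak at 11438.39 Da was reported.
enrichment_hits = matching_fragments(11438.39)

# Control with the wrong fragment (number 3): the three reported peaks.
control_peaks = [8663.17, 6616.52, 8678.68]
control_hits = [matching_fragments(peak) for peak in control_peaks]

if __name__ == "__main__":
    print(enrichment_hits)
    print(control_hits)
```

With these inputs only fragment 1 is matched in the enrichment experiment, and none of the peaks in the wrong-fragment control matches any fragment mass.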

The enriched fragments obtained using this strategy may then be used to generate
a new library of encoded molecules. This new library will contain encoded molecules
composed of the enriched chemical entities. Thus, the library size has been
reduced due to the removal of chemical entities not involved in binding encoded
molecules, and the library is enriched in chemical entities that are highly represented in
the encoded molecules which bind to the target molecule.

Example 1 shows the possibility of enriching for specific building block molecules,
i.e. nucleic acid fragments associated with transferable chemical entities. The same
procedure can be used for a pool of building blocks larger than the four used herein.
The codon design will determine the maximum number of building blocks that can
be used. The sequence in the codon region should be large enough to allow
discrimination in the annealing step. Various conditions can be used to increase the
stringency in the annealing step. Parameters such as temperature, salt, pH,
formamide concentration, time and other conditions could be used.

Example 2 (model): Multiple codon selection in a library.
This example describes the enrichment of building blocks using an identifier nucleic
acid (identifier) molecule with multiple codons. These codons encode a displayed
molecule (DM) that is attached to the identifier molecule before the selection is
performed. The library size is determined both by the number of different chemical
entities and the total number of chemical entities. In the identifier molecule shown
below, which carries the displayed molecule, the codons
are indicated with underlines and the region separating the codons (framing region)
in boldface. The size of the codons can be varied depending on the diversity needed in
the library and the optimal setup for chemical entity enrichment. The framing region
can also be varied depending on the discrimination needed to distinguish the
precise position of a codon in the identifier molecule. The framing region will also be
important for the generation of the library. This can be understood when the
encoding is accomplished by extension of the encoding region as disclosed in DK PA
2002 01965 and US 60/434,425, incorporated herein by reference. There needs to be
a perfect match in the 3'-end in order to get efficient extension with a polymerase or
a ligase. The size of this spacing/framing region should be long enough to form a
complementing region to allow extension with a polymerase or ligase. Preferably,
the spacing region should be between 3 and 6 nucleotides. The codon region
together with the spacing region will also be useful when codons are to be identified
using a microarray setup. The identifier molecule with the right codon sequences
will hybridize to the array and be detected.

The sequence below represents an enriched identifier molecule attached to the
displayed molecule (DM). This identifier molecule has been enriched due to the fact
that the DM binds to the target molecule in the selection process. In practice, more
than one enriched identifier molecule will be obtained when using a library of
displayed molecules, each attached to its identifier sequence.






EM- 



GT 



This identifier molecule is amplified with two primers (below) using a standard PCR
reaction, for example 500 nM of each primer, 2.5 units Taq polymerase and 0.2 mM of
each dNTP in a PCR buffer (50 mM KCl, 10 mM Tris-Cl, 3 mM DTT, 15 mM MgCl2,
0.1 mg/ml BSA). Run 25 cycles (94°C melt for 30 seconds, 55°C anneal for 45
seconds, 72°C extension for 60 seconds).



B-GCACACTAGCTTGAGCACACTGACA-3'
CGAAATGCTAGGGCGTCCATTGGCA-5'

This will amplify the identifier molecule from the selection process and add a biotin
to the 5'-end of one of the strands (below). This amplified product is then immobilized
on a solid support, for example streptavidin beads. This can be performed exactly
as described in Example 1.

When the identifier molecules have been immobilized and the excess has been
removed by a washing step (as described in Example 1), the complementing non-sense
strand is removed by incubating in 10 mM NaOH for about 2 min, followed by washing
with 100 mM MES buffer, pH 6.0. This procedure will generate the strand shown below, where
the codon regions are exposed to allow hybridization with the complementing
sequences.

B-GCACACTAGCTTGAGCACACTGACAC MGGAGATCaC^
GT

The next step is to protect the complementing sequences outside the codons to
prevent the binding of the building blocks to these sequences. This can be performed by
adding "blocking" oligonucleotides that have a complementing sequence. This is
shown below.

B-
GT

CGTGTGATCGAACTCGTGTGACTGT CGAAATGCTAGGGCGTCCATTGGCA
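
As a quick consistency check on the blocking step, the upstream blocking oligonucleotide should pair base by base with the primer-derived end of the immobilized strand. The Python sketch below assumes the two sequences as they read in this example (forward primer written 5' to 3', blocking oligonucleotide written 3' to 5'); the helper name is hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement, keeping the left-to-right writing order."""
    return "".join(PAIR[base] for base in seq)


# Biotinylated forward primer, 5'->3' (start of the immobilized strand).
forward_primer = "GCACACTAGCTTGAGCACACTGACA"

# Upstream blocking oligonucleotide, written 3'->5' under the identifier.
blocking_oligo = "CGTGTGATCGAACTCGTGTGACTGT"

# True when every position of the blocking oligo pairs with the primer region.
fully_paired = complement(forward_primer) == blocking_oligo

if __name__ == "__main__":
    print(fully_paired)
```

A full match means the region outside the codons is covered, so that building blocks can only anneal through their codon-specific part.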

Next, the pool of different building blocks is added and allowed to anneal to the
codon region in the identifier molecule. The position of annealing is determined by the
spacing region shown in boldface. The stringency is adjusted to only allow
hybridization of the correct building block in the right position. This can be accomplished by
mixing the right components together under various conditions. The conditions can for
example include the presence of salt, formamide and various buffers adjusted to
a suitable pH and temperature. Below is the correct building block that will anneal to
the enriched identifier molecules. These building blocks are annealed and eluted as
described in Example 1.



CE-CGTGTGATCGAACTCGTGTGACTGTGT2^CTCTAGTGTW:

The next pool of building blocks is blocked with an oligonucleotide that also protects
the first codon. This is necessary to prevent binding of the building blocks in that
codon.






GCACACTAGCTTGAGCACACTGACACM^GAGAT^
GT

CGTGTGATCGAACTCGTGTGACTGTGIMIIIIIIII CGAAATGCTAGGGCGTCCATT-

Again, the library of building blocks is added to enrich for the selected codons.
Below is the building block with the correct sequence. These building blocks are
annealed and eluted as described in Example 1.

CE-CGTGTGATCGAACTCGTGTGACTGTeaMIIIIIIIITACGWi^

Finally, the identifier molecule is protected with blocking oligos that expose only the
last codon.



B-
GT

CGTGTGAT^
GGCA

ATCGAACTCGTGTGaCTGTCTAIllIIlIIiaaCIIimill CGAAATGCTAGGGCGTCCATT-



A new pool of building blocks is added and allowed to hybridize to the identifier
molecule. These building blocks are annealed and eluted as described in Example 1.

C8--CGTGTGATCGAACTCGTGTGMriOTO!MlIIIIIIITaClIIIlIIIIJ^^ 

The enrichment of each library of building blocks is performed in separate tubes in
order to keep the libraries of building blocks separated. The enrichment is performed
with building blocks loaded with chemical entities (CE).

Example 3 - Template versus chemical entity evolution

The graph below illustrates the relationship between the number of chemical entities
and the library size. The example is calculated on the basis that the final encoded
molecules contain four chemical entities, each individually encoded by the
corresponding building block (library size n^4, where n is the number of building blocks).
The graph shows that the diversity decreases dramatically with the reduction of the
total number of building blocks. If the number of different building blocks can be
reduced to about 20-30 (library sizes of 16*10^4 and 81*10^4, respectively) in the
selection process, then the library size for the final round of selection is low enough
for identification of the binding molecules.

When the same analysis is performed on a protein, another situation is obtained. The
example shown below is for a very small protein (50 amino acids in length). The
diversity is enormous when all amino acids are included in the library. The size of
the library also decreases with the total number of amino acids, but not to the
same extent as shown above for a small molecule. Even when the number of different
amino acids is reduced to 2, the library size is huge (1.2*10^15). This shows that amino
acid enrichment is impossible on a protein. This is even more pronounced for a mid-size
protein, which contains about 300 amino acids.
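
The contrast drawn in this example follows from the size of a fully combinatorial library, which is the number of alternatives per position raised to the number of positions. The short Python sketch below reproduces the figures under that assumption; the function name is hypothetical.

```python
def library_size(alternatives_per_position, positions):
    """Number of distinct members when every position varies independently."""
    return alternatives_per_position ** positions


# Small molecule built from four chemical entities (n^4).
small_molecule_20 = library_size(20, 4)
small_molecule_30 = library_size(30, 4)

# 50-residue protein restricted to 2 amino acids, or using all 20.
protein_2 = library_size(2, 50)
protein_20 = library_size(20, 50)

if __name__ == "__main__":
    print(small_molecule_20, small_molecule_30)
    print(f"{protein_2:.2e}", f"{protein_20:.2e}")
```

Reducing the building-block pool to 20-30 members brings a four-position library below one million members, whereas a 50-position library stays on the order of 10^15 even with only two alternatives per position.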
Example 4 - Codon analysis
This example illustrates one possibility to perform codon analysis on a whole
population of different identifier oligonucleotides. The analysis can also be performed
using an array where the probe oligonucleotides (complementary to the codons) are
immobilized in discrete areas and the signal is monitored dependent on the amount
of identifier oligonucleotides hybridised in each specific area. Codon analysis
can also be performed using standard sequencing using a polymerase extension
step.

In Fig. 5, four codons are shown (P1 through P4; bold pattern) along with flanking
regions (light pattern). A universal Taqman probe anneals to a region adjacent to the
codon region, but within the amplicon defined by the universal PCR primers Pr. 1
and Pr. 2. These primers could be the same as those used for amplification of the
identifier oligonucleotides encoding binders after an enrichment process on a specific
target. However, if minimal-length identifiers are preferred during the encoding process,
the region involved in Taqman probe annealing could be appended to the library
identifier oligonucleotides by e.g. overlap PCR, ligation, or by employing a long
downstream PCR primer containing the necessary sequences. The added length
corresponding to the region necessary for annealing of the Taqman probe would be from
20 to 40 nt, depending on the type of Taqman probe and the TA of the PCR primers.
The Q-PCR reactions are preferably performed in a 96- or 384-well format on a
real-time PCR thermocycling machine.

Fig. 5, panel A, shows the detection of abundance of a specific codon sequence in
position one. Similar primers are prepared for all codon sequences. For each codon
sequence utilized to encode a specific BB in the library, a Q-PCR reaction is
performed with a primer oligonucleotide complementary to the codon sequence in
question. A downstream universal reverse primer Pr. 2 is provided after the Taqman
probe to provide for an exponential amplification of the PCR amplicon. The setup is
most suited for cases where the codon constitutes a length corresponding to a
length suitable for a PCR primer.

Fig. 5, panel B, shows the detection of abundance of a specific codon sequence in a
specific codon position using a primer which complements a codon and a framing
sequence. Similar primers are used for all the codons and framing sequences.
For each codon sequence utilized to encode a specific BB at a specific position
in the library, a Q-PCR reaction is performed with an oligo complementary to the
codon sequence in question as well as a short region up- or downstream of the
codon region, which ensures extension of the primer in a PCR reaction only when
annealed to the codon sequence in that specific codon position. The number of
specific primers and Q-PCR reactions needed to cover all codon sequences in all
possible codon positions equals the number of codon sequences times the number of
codon positions. Thus, monitoring the abundance of 96 different codon sequences
in 4 different positions can be performed in a single run on four 96-well microtitre
plates (as shown in Fig. 5, panel B) or a single 384-well plate on a suitable
instrument. This architecture allows for the decoding of an 8.5*10^7 library of different
encoded molecules.
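
The plate arithmetic in this paragraph can be checked with a short calculation. The Python sketch below is illustrative and uses hypothetical variable names; it only restates the relationships given in the text (reactions equal codon sequences times positions, library size equals codon sequences to the power of the number of positions).

```python
codon_sequences = 96   # different codon sequences per position
codon_positions = 4    # positions P1 through P4

# One codon-and-position specific Q-PCR reaction per combination.
reactions_needed = codon_sequences * codon_positions

# Plates needed in the two formats mentioned in the text.
plates_96_well = reactions_needed // 96
plates_384_well = reactions_needed // 384

# Number of encoded molecules that this set of reactions can profile.
decodable_library = codon_sequences ** codon_positions

if __name__ == "__main__":
    print(reactions_needed, plates_96_well, plates_384_well)
    print(f"{decodable_library:.1e}")
```

The number of reactions grows linearly with the number of codons and positions, while the library that can be profiled grows exponentially with the number of positions.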

Quantification is performed relative to the amount of full-length PCR product
obtained in a parallel control reaction on the same input material performed with the
two external PCR primers Pr. 1 + Pr. 2. Theoretically, a similar rate of accumulation
of this control amplicon compared to the accumulation of a product utilizing a single
codon + sequence specific primer would indicate a 100% dominance of this
particular sequence in the position in question.
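
The relative quantification described here amounts to dividing the quantity measured with a codon-specific primer by the quantity measured in the control reaction with the two external primers. A minimal Python sketch, assuming that simple ratio is what is intended and using the Observed A values reported for Sample A in Table I (FPv2 as control, P1-1 as codon-specific primer); the function name is hypothetical.

```python
def relative_abundance(codon_quantity, control_quantity):
    """Fraction of identifiers carrying the codon, relative to the control amplicon."""
    if control_quantity <= 0:
        raise ValueError("control quantity must be positive")
    return codon_quantity / control_quantity


# Observed A values for Sample A as reported in Table I.
control_fpv2 = 12539947.00
codon_p1_1 = 445841.90

fraction_p1_1 = relative_abundance(codon_p1_1, control_fpv2)

if __name__ == "__main__":
    print(f"{fraction_p1_1:.4f}")
```

A value of 1.0 would correspond to the 100% dominance mentioned in the text; in an equimolar mix of ten identifiers the ideal value per codon would be 0.1.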

Although the setups shown in Fig. 5, panels A and B, employ a Taqman probe
strategy, other detection systems (SYBR green, Molecular Beacons etc.) could be
utilized. In theory, multiplex reactions employing up to 4 different fluorophores in the same
reaction could increase throughput correspondingly.

An example of how a deconvolution process of a library of encoded molecules
occurs is described in the following. Imagine that at the end of a selection scheme a
pool of 3 ligand families (and the corresponding coding identifiers) is dominating
the population and present at approximately the same concentration. Three different
chemical entities are present in the first position of the encoded compounds, and
each of these chemical entities is present in combination with one unique chemical
entity out of 3 different chemical entities in position P2. Only one chemical entity in
position 3 gives rise to active binders, whereas any of a 20% subset of chemical
entities (e.g. determined by charge, size or other characteristics) are present in
position 4. The outcome of the initial codon profile analysis would be: 3 codon
sequences are equally dominating in position P1, 3 other codon sequences in position
P2, 1 unique codon sequence is dominant in P3, whereas somewhat similarly
increased levels of 20% of the codon sequences (background levels of the remaining
80% sequences) are seen in P4. In such cases it could be relevant to use an
iterative Q-PCR ("IQ-PCR") strategy to perform a further deconvolution of a library after
selection. Again with reference to the example above, by taking the PCR products
from the 3 individual wells that contained primers giving the high yields in position
P1, diluting the product appropriately and performing a second round of Q-PCR on
each of these identifier oligonucleotides separately, it would be possible to deduce
which codon sequence(s) is preferred in P2 when a given codon sequence is
present in P1.

Synthesis of identifier oligonucleotides:

The 10 identifier oligonucleotides were assembled in 10 separate 50 µl PCR reactions,
each containing 0.05 pmol of the oligos Q-Temp1-X, Q-Temp2, Q-Temp3-X and
Q-Temp4 (X = 1 through 10) and 25 pmol of the external primers FPv2 and RPv2 with TA =
53°C. The 160 bp products were gel-purified using QIAquick Gel Extraction Kit from
QIAGEN (Cat. No. 28706) and quantified on a spectrophotometer. As a control, 20 ng of
each of the identifiers (as estimated from these measurements) were loaded on an
agarose gel.

Preparation of samples for Q-PCR:

Sample A: Generated by mixing 20 ng from each identifier oligonucleotide prep. The
volume was adjusted to 50 µl. Concentration: 4 ng/µl = 38.46 fmol/µl (160 bp x 650 Da/bp
= 1.04x10^5 g/mol; 1 ng = 9.615 fmol). Diluted to 10^7 copies/5 µl (0.00332 fmol/µl).
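
The concentration figures for Sample A follow from the approximate molar mass of a double-stranded product (650 Da per base pair, as used in the text) and Avogadro's number. The Python sketch below retraces that arithmetic; the function names are hypothetical and the calculation is an illustration rather than part of the original protocol.

```python
AVOGADRO = 6.02214076e23   # molecules per mol
DA_PER_BP = 650.0          # average mass of one base pair, as used in the text


def fmol_per_ng(length_bp):
    """Femtomoles of double-stranded product contained in 1 ng."""
    molar_mass = length_bp * DA_PER_BP          # g/mol
    return 1e-9 / molar_mass * 1e15             # ng -> g -> mol -> fmol


def copies(fmol):
    """Number of molecules in a given amount of femtomoles."""
    return fmol * 1e-15 * AVOGADRO


# Sample A: 10 identifiers x 20 ng in 50 µl.
stock_ng_per_ul = 10 * 20 / 50
stock_fmol_per_ul = stock_ng_per_ul * fmol_per_ng(160)

# Working dilution quoted in the text: 0.00332 fmol/µl, 5 µl per reaction.
copies_per_reaction = copies(0.00332 * 5)

if __name__ == "__main__":
    print(round(fmol_per_ng(160), 3), round(stock_fmol_per_ul, 2))
    print(f"{copies_per_reaction:.3e}")
```

The result is consistent with the stated 38.46 fmol/µl stock and with a working dilution of about 10^7 copies per 5 µl.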

Sample B: 20 ng/20 µl stocks of each identifier were prepared. The sample was mixed
as follows:

5 µl undil. Identifier #10
5 µl 2x dil. Identifier #9
5 µl 4x dil. Identifier #8
5 µl 8x dil. Identifier #7
5 µl 16x dil. Identifier #6
5 µl 32x dil. Identifier #5
5 µl 64x dil. Identifier #4
5 µl 128x dil. Identifier #3
5 µl 256x dil. Identifier #2
5 µl 512x dil. Identifier #1

Concentration: 10 ng/50 µl = 0.20 ng/µl = 1.923 fmol/µl. Diluted 579.2-fold to 10^7
copies/5 µl (0.00332 fmol/µl).
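
The composition of Sample B can be retraced numerically. The Python sketch below assumes a strict two-fold dilution series (1x, 2x, ..., 512x) of 1 ng/µl stocks, 5 µl of each, and takes 5,000,000 copies as the expected value for the undiluted identifier #10, as listed in the Expected column of Table II; the function name is hypothetical.

```python
STOCK_NG_PER_UL = 20 / 20      # 20 ng in 20 µl
VOLUME_UL = 5.0                # volume taken from each dilution
TOP_EXPECTED_COPIES = 5_000_000.0


def expected_copies(identifier_number, n_identifiers=10):
    """Expected copies for an identifier diluted 2^(10 - number)-fold."""
    dilution = 2 ** (n_identifiers - identifier_number)
    return TOP_EXPECTED_COPIES / dilution


# Total DNA mass in the mix: 5 µl from each of the ten dilutions.
total_ng = sum(VOLUME_UL * STOCK_NG_PER_UL / 2 ** k for k in range(10))

expected_by_identifier = {n: expected_copies(n) for n in range(1, 11)}

if __name__ == "__main__":
    print(round(total_ng, 3))
    for number, value in expected_by_identifier.items():
        print(number, value)
```

The total comes to just under 10 ng in 50 µl, matching the stated 0.20 ng/µl, and the expected copy numbers halve from identifier #10 down to identifier #1.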

Standard curve: The samples for the standard curve were prepared by diluting Sample
A 116.55-fold to 10^9 copies/5 µl (0.33 fmol/µl) and subsequently performing a 10-fold
serial dilution of this sample. 5 µl was used for each PCR reaction. The standard curve
is shown in Fig. 6.



Q-PCR reactions

For 5 ml premix (for one 96-well plate):

2.5 ml Taqman Universal PCR Master Mix (Applied Biosystems; includes Taq
polymerase, dNTPs and optimized Taq pol. buffer)
450 µl RPv2 (10 pmol/µl)
25 µl Taqman probe (6-FAM-TCCAGCTTCTAGGAAGAC-MGBNFQ; 50 µM; Applied
Biosystems)
1075 µl H2O
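
The premix recipe can be checked for internal consistency. The Python sketch below assumes 40.5 µl of premix per 50 µl reaction (as used in the plate setup of this example) and reads the probe stock as 50 µM; both the reading of that unit and the variable names are assumptions made for illustration.

```python
# Premix components in µl, as listed above.
PREMIX_UL = {
    "master_mix": 2500.0,
    "RPv2": 450.0,
    "taqman_probe": 25.0,
    "water": 1075.0,
}
PREMIX_PER_WELL_UL = 40.5
REACTION_VOLUME_UL = 50.0

premix_total_ul = sum(PREMIX_UL.values())
wells_covered = premix_total_ul / PREMIX_PER_WELL_UL


def final_concentration_um(stock_um, component_ul):
    """Final concentration (µM) of a premix component in one 50 µl reaction."""
    per_well_ul = component_ul * PREMIX_PER_WELL_UL / premix_total_ul
    return stock_um * per_well_ul / REACTION_VOLUME_UL


# 10 pmol/µl equals 10 µM; probe stock assumed to be 50 µM.
rpv2_final_um = final_concentration_um(10.0, PREMIX_UL["RPv2"])
probe_final_um = final_concentration_um(50.0, PREMIX_UL["taqman_probe"])

if __name__ == "__main__":
    print(premix_total_ul, round(wells_covered, 1))
    print(round(rpv2_final_um, 3), round(probe_final_um, 3))
```

Under these assumptions the premix covers 100 reactions of 50 µl (5 ml in total), with the reverse primer at about 0.9 µM and the probe at about 0.25 µM in each well.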

40.5 µl premix was aliquoted into each well, and 4.5 µl of the relevant upstream PCR primer
(FPv2 (for the standard curve) or one of the codon-specific primers listed below; 10
pmol/µl) and 5 µl sample (H2O in wells for negative controls) were added. The
codon-specific PCR primers were (Tm calculations shown are from Vector NTI; matched to
the Tm for RPv2 (67.7°C)):

P1-1:  GTCATAGTAGCTGCTAGAGATGTGGTGATA  66.8°C
P1-2:  CATACGGAAGAAGACAGAAGAGCTGATA    67.8°C
P1-3:  TGATAGTGAGGAGTCGAGAAGTGAAGATA   67.6°C
P1-4:  CATAGTGTGTAGGTCAAGAGGTCAGATA    67.4°C
P1-5:  GATACTGTGGAACTACCATCCAAGGATA    68.0°C
P1-6:  CCATCCAACATCGTTGGAAGAT          67.8°C
P1-7:  CATAGAAGGTGTCGTGTGAGATCTGATA    67.7°C
P1-8:  ATAGTGAGGAAGCTGGATGATGAGATA     67.3°C
P1-9:  GATAGTAGGATGGATCGAAGGTAGGATA    68.1°C
P1-10: TCATACTCGAAGCTACTGTCGAGATGATA   68.2°C

P2-1:  ATATTAGTGTGTGACGATGGTACGCA      67.8°C
P3-1:  AGAAGTAGGAACGTGGATGAGAGA        67.?°C

P4-1:  CGAGGAGGAGGTGGAAGCT             67.7°C
P4-2:  TGGAGGAGTGGAGGTGGA              68.3°C
P4-3:  GCTTCCTCTGCTGCACCA              66.7°C
P4-4:  GGTGTCGAGGTGAGCAGCA             69.1°C
P4-5:  GGAGGAGGTGGATCCTGGT             68.6°C
P4-6:  GTGAGGAGGAGGTCGTGGTGT           68.0°C
P4-7:  GTGAGAGTGGTGGTGGTGGA            68.8°C
P4-8:  CATGTCGAGGACCTGCTCCT            67.9°C
P4-9:  AGGAGGTGTGGAGTGGTGGA            68.?°C
P4-10: ACTGAGCTGCTCCTCCAGGT

Thermocycling/measurement of fluorescence was performed on an Applied Biosystems
ABI Prism 7900HT real-time instrument utilizing the standard cycling parameters:
95°C 10 min;
40 cycles of
95°C 15 sec;
60°C 1 min

All samples were run in duplicate.

Results 

Fig. 6 shows the standard curve calculated by the 7900HT system software. The log of
the starting copy number was plotted against the measured CT value. The relationship
between CT and starting copy number was linear in the range from 10 to 10^9 identifier
copies.
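
The principle behind such a standard curve is a straight-line fit of CT against the logarithm of the starting copy number, which is then inverted to quantify unknown samples. The Python sketch below illustrates that principle only: it is not the 7900HT software, the function names are hypothetical, and the standards are synthetic values generated from an assumed slope of -3.32 and an assumed intercept of 40.

```python
import math


def fit_standard_curve(copy_numbers, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copy_numbers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic (hypothetical) 10-fold dilution series from 10 to 10^9 copies.
standards = [10 ** k for k in range(1, 10)]
standard_cts = [40.0 - 3.32 * math.log10(c) for c in standards]

slope, intercept = fit_standard_curve(standards, standard_cts)
unknown_copies = copies_from_ct(40.0 - 3.32 * 5, slope, intercept)

if __name__ == "__main__":
    print(round(slope, 2), round(intercept, 2))
    print(f"{unknown_copies:.3e}")
```

A slope near -3.32 corresponds to a doubling of product per cycle; real data would scatter around the fitted line rather than fall exactly on it.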

This standard curve was utilized by the system software to calculate the quantity in the
"unknown" samples as shown below.

Table I: Sample A (shown graphically in Fig. 7)

Sample A: Equimolar ratios

Primer   Observed A     Observed B     Expected
FPv2     12539947.00    11977503.00    10000000
P1-1     445841.90      480382.03      1000000
P1-2     884840.70      847478.56      1000000
P1-3     1013073.56     948770.00      1000000
P1-4     764187.94      741304.40      1000000
P1-5     1352874.60     1275155.50     1000000
P1-6     1284075.60     1337928.50     1000000
P1-7     658161.80      747371.66      1000000
P1-8     742187.20      653874.00      1000000
P1-9     824587.75      705785.75      1000000
P1-10    813550.75      836037.90      1000000
P2-1     13145159.00    14482606.00    10000000
P3-1     13263911.00    12773780.00    10000000
P4-1     1430704.80     1472570        1000000
P4-2     2681652.00     2481024.00     1000000
P4-3     1933106.80     2085476.40     1000000
P4-4     1359684.40     1364621.40     1000000
P4-5     2206709.80     2065813.60     1000000
P4-6     1652718.10     1873777.20     1000000
P4-7     1468208.10     1416153.00     1000000
P4-8     1664467.50     1581067.00     1000000
P4-9     1462520.60     1594593.80     1000000
P4-10    2020088.20     1912277.40     1000000


Table II: Sample B (shown graphically in Fig. 8)

Sample B: 2-fold dil.

Primer   Observed A     Observed B     Expected
FPv2     4.97E+06       5.05E+06       10000000
P1-1     9955.07        10899.97       9765.625
P1-2     12732.32       13469.12       19531.25
P1-3     25542.8        25419.85       39062.5
P1-4     34748.89       44070.81       78125
P1-5     110881.41      123734.13      156250
P1-6     163687.44      166220.5       312500
P1-7     156993.81      172005.64      625000
P1-8     343176.78      374809.13      1250000
P1-9     646619.44      576151         2500000
P1-10    1.49E+06       (illegible)    5000000
P2-1     5.19E+06       5.37E+06       10000000
P3-1     5.29E+06       5.09E+06       10000000
P4-1     (no signal)    70223.8        9765.625
P4-2     42103.32       22733.17       19531.25
P4-3     54480.62       39663.62       39062.5
P4-4     51293.07       43950.9        78125
P4-5     137946.95      115027.34      156250
P4-6     174134.64      156442.55      312500
P4-7     316505.78      283856         625000
P4-8     737661.44      691296.7       1250000
P4-9     1.42E+06       1.45E+06       2500000
P4-10    3.72E+06       3.52E+06       5000000



The results of the experiments show the possibility of accurate quantification of
identifier oligonucleotides down to or even below 10 copies with a 9 fold dynamic range, and
reliable relative quantification of the tested codons in various positions in the identifier
oligonucleotide.

Example 5 - Codon analysis
Another possibility to analyse codons in identifier oligonucleotides is to use an array
format with attached probe oligonucleotides.

Six adaptors with the different anti-codon sequences in all three positions were
designed. All the adaptors contain a probe binding sequence (20 nucleotides) that allows
discrete binding on the microarray. Probe design is known in the art. Adaptors
harbouring one to three deletions in the spacing region were used as negative controls to
ensure that only the framing region is responsible for the hybridization of the identifier.
Thus, the negative controls contain another framing sequence. The identifier
oligonucleotide harbours the complementing codon sequence and the position-directing
framing regions.

Adaptor oligonucleotides

3' CTCATCGGAAGGGCTCGTAACG GTGGGTTTGGG[illegible]G CTGGGTTTGGGG C GTGGGTTTGGGC GG-5'
3' TTTGGTAGCTGAGTGCCCTAGG GTGGGTTTGGGC G GTGGGTTTGGGG G CTGGGTTTGGGG CG-5'
3' TAACTGGTTTGACGCCACGCGC GTGGGTTTGGGG C GTGGGTTTGGGC G GTGGGTTTGGGG GC-5'
3' TAATTGAGCTGACGGCGGACGG CTGGGTTTGGGCGTGGGTTTGGGGCTGGGTTTGGG GCG-5'
3' TGTTGCTACTCTGGCCCGAGG CTGGGTTTGGGCTGGGTTTGGGCTGGGTTTGGG GCG-5'
3' ACGGGATAACAACGCAGGCTGG CTGGGTTTGGGTGGGTTTGGGTGGGTTTGGG GCG-5'

Identifier oligonucleotide
Biotin-5' GCCACCCAAACCCCCG

GenFlex hybridisation and scanning. Prior to hybridization, the adaptor mix (100 pM
final concentration for each of the adaptor oligonucleotides) in a hybridization buffer
(100 mM MES, 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, 1x Denhardt's) was heated
to 95°C for 5 min and subsequently cooled and maintained at 40°C for 5 min before
loading onto the Affymetrix GenFlex probe array cartridge. The probe array was then
incubated for 2 h at 45°C at constant rotation (60 rpm). The remaining adaptor mix was
removed from the GenFlex cartridge and replaced with the identifier in a hybridization
buffer (100 mM MES, 1 M NaCl, 20 mM EDTA, 0.01% Tween 20, 1x Denhardt's). The
identifier hybridisation mix was heated to 95°C for 5 min and subsequently cooled and
maintained at 40°C for 5 min before loading onto the Affymetrix GenFlex probe array
cartridge and hybridised for 2 h at 45°C at constant rotation (60 rpm). The washing and
staining procedure was performed in the Affymetrix Fluidics Station. The probe array was
exposed to 2 washes in 6xSSPE-T at 25°C followed by 12 washes in 0.5xSSPE-T at
40°C. The biotinylated identifier oligonucleotide was stained with a
streptavidin-phycoerythrin conjugate, final concentration 2 ng/µl (Molecular Probes, Eugene, OR), in
6xSSPE-T for 10 min at 25°C followed by 6 washes in 6xSSPE-T at 25°C.

The probe arrays were scanned at 560 nm using a confocal laser-scanning microscope
with an argon ion laser as the excitation source (Hewlett Packard GeneArray Scanner
G2500A). The readings from the quantitative scanning were analysed by the Affymetrix
Gene Expression Analysis Software. The results are depicted in Scheme 1.

Scheme 1:
[Figure not reproduced. Array signal shown for Adaptor Oligo1, Adaptor Oligo2, Adaptor
Oligo3, Adaptor Oligo4, Adaptor Oligo5 and Adaptor Oligo6.]



The array analysis shows that the codons including the framing regions are able to
distinguish between the different probe oligonucleotides. The designed probes will only
detect codons with the correct framing region, which allows identification first of the
right codon and second of the position at which the codon is located. A single deletion
in both framing regions significantly reduces the hybridization of the identifier. Thus,
the framing sequence may be used to obtain information about the position of a specific
codon and the point in the reaction history when a given reaction of a chemical entity
has occurred.

The information obtained in this example, using either QPCR or array codon analysis,
can be used to generate a new, more focused library. The signal from the QPCR analysis
or the array analysis can be used directly to combine preferable chemical entities.



EXAMPLE 6. Generation of a second-generation library.
The information obtained from a codon analysis performed according to the principles
described in Examples 4 or 5 can be utilized to assemble a new, more focused library.
Sequence information can also be used to design a second-generation library with
reduced diversity. This example illustrates how sequence data can be utilized to make a
more focused library with the enriched chemical entities. An identical strategy can be
based on the codon analysis methods described in Examples 4 or 5.

A 700-member library was generated, composed of 4 x 25 x 7 chemical entities. The
library generation protocol is described below with the sequence information and
chemical entity structure.
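
The library size follows directly from the number of chemical entities available at each position. The following is a minimal Python sketch of that arithmetic; the labels A1..A4, B1..B25 and C1..C7 are hypothetical placeholders, not the building block identifiers used in this example.

```python
from itertools import product

# Hypothetical placeholder labels for the chemical entities at each position.
position_1 = [f"A{i}" for i in range(1, 5)]    # 4 chemical entities
position_2 = [f"B{i}" for i in range(1, 26)]   # 25 chemical entities
position_3 = [f"C{i}" for i in range(1, 8)]    # 7 chemical entities

# Every library member is one combination of one entity per position.
library = list(product(position_1, position_2, position_3))
library_size = len(library)

print(library_size)
```

Enumerating the combinations explicitly, rather than only multiplying 4 x 25 x 7, makes it easy to reuse the same list when a reduced set of entities is chosen for a second-generation library.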

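General arrangement of each complex composed of display molecule and identifier
oligonucleotide in the library generation:

[Scheme not reproduced. The display molecule (HN-Ra, HN-Rb, HN-Rc-NH2) is attached to
the 5' end of the identifier, which is composed of Oligo Ax, Oligo Bx and Oligo Cx and
is hybridised to oligo ax, oligo bx, oligo cx and oligo d.]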

Specific codons in each oligo (Ax, Bx, Cx) were used and can be designed by using a
specific nucleotide sequence for each chemical entity. In this particular setup, two
complementary oligonucleotides (e.g. oligo Ax and oligo ax) containing a particular
codon are allowed to hybridise before the ligation step. The codon oligonucleotide in
each position is ligated in connection with the attachment of the encoded chemical
entity.



Overview of the library generation procedure:

First round of library generation (round A):

[Scheme not reproduced. Building block Ax (BBAx), carrying a Pnt-protected amine, is
attached to H2N-5'-template.]

*Pnt corresponds to pentenoyl, an amine protecting group. "R" can be any molecule
fragment. The chemicals used in library generation comprise a primary (shown) or a
secondary amine.

Second round of library generation (round B):

[Scheme not reproduced. Building block Bx (BBBx) is attached to the molecule on the
5'-template.]

Third round of library generation (round C):

[Scheme not reproduced. Building block Cx (BBCx) is attached to the molecule on the
5'-template.]



General procedure : Library generation, selection and mismatch subsequent selection 



First round of library generation (round A):

First, oligonucleotides of the A series are each modified by adding to each type of oligo
a small molecule building block (BBAx) at the 5' amine, forming an amide bond. After
this step the identifier is comprised of oligo Ax.

Second round of library generation (round B):

4 nmol of a mixture of different modified A oligos is then split into a number of tubes
corresponding to the number of different building blocks to be used in round B. 190
pmol oligo a and 2 µl herring DNA is added to each tube and the DNA material in each
tube is lyophilized. The lyophilized DNA is then redissolved in 50 µl water and purified
by spinning through Biospin P-6 columns (Biorad) equilibrated with water.



Addition of building block

The DNA material in each tube is again lyophilized and redissolved in 2 µl 100 mM
Na-borate pH 8.0/100 mM sulfo N-hydroxysuccinimide (sNHS). For each tube 10 µl
building block BBBx (100 mM in dimethyl sulfoxide [DMSO]) is preactivated by mixing with
10 µl 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC) (90 mM in dimethylformamide
[DMF]) and incubating at 30°C for 30 min. 3 µl of this preactivated mixture is
then mixed with the 2 µl in each tube and allowed to react for 45 min at 30°C. Then an
additional 3 µl freshly preactivated BB is added and the reaction is allowed to proceed
for 45 min at 30°C. The resulting mixture is then purified by spinning through Bio-Rad
P6 DG (desalting gel).
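
The nominal concentrations in the coupling reaction can be worked out from the volumes and stock concentrations given above. The following Python sketch is only an illustration under a simplifying assumption: it treats the mixtures as ideal dilutions and ignores that building block and EDC are consumed during preactivation. The helper name `mix` is hypothetical.

```python
def mix(components):
    """Combine (volume_ul, {name: conc_mM}) components; return total volume and final mM."""
    total = sum(volume for volume, _ in components)
    final = {}
    for volume, concentrations in components:
        for name, conc in concentrations.items():
            final[name] = final.get(name, 0.0) + conc * volume / total
    return total, final

# Preactivation: 10 ul building block (100 mM) + 10 ul EDC (90 mM).
pre_volume, pre = mix([(10, {"BB": 100}), (10, {"EDC": 90})])

# First addition: 3 ul preactivated mixture into 2 ul borate/sNHS (100 mM each).
rxn_volume, rxn = mix([(3, pre), (2, {"borate": 100, "sNHS": 100})])

print(pre_volume, pre)
print(rxn_volume, rxn)
```

Under these assumptions the building block is nominally 50 mM after preactivation and 30 mM in the 5 µl reaction; the second 3 µl addition can be modelled by calling `mix` again with the current reaction as one component.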

Addition of codon oligonucleotide

The DNA material is then lyophilized and redissolved in 10 µl water containing 200
pmol oligo Bx (e.g. B1) and the corresponding oligo bx (e.g. b1). This is done so that the
codon in oligo Bx identifies the BBBx added to the DNA identifier. 10 units of T4 DNA
ligase (Promega) and 1.2 µl T4 DNA ligase buffer is then added to each tube and the
mixture is incubated at 20°C for 1 hour. The DNA identifier linked to the small
molecules now comprises an Ax oligo with a Bx oligo ligated to its 3' end. The reactions are
then pooled, an appropriate volume of water is allowed to evaporate and the remaining
sample is purified by spinning through Biospin P-6 columns (Biorad) equilibrated with
water.

Removal of building block protecting group

The pooled sample (~50 µl) is adjusted to 10 mM Na-acetate (pH 5). 0.25 volumes of
25 mM iodine in tetrahydrofuran/water (1:1) is added and the sample is incubated at
37°C for 2 h. The reaction is then quenched by addition of 2 µl of 1 M Na2S2O3 and
incubation at room temperature for 5 min. The complexes are then purified by spinning
through Biospin P-6 columns (Biorad) equilibrated with water.

To remove sulphonamide protecting groups, the sample is adjusted to 50 µl 100 mM
sodium borate pH 8.5 and 20 µl 500 mM 4-methoxythiophenol (in acetonitrile) is
added and the reaction is incubated at 25°C overnight. Then the complexes are purified
by spinning through Biospin P-6 columns (Biorad) equilibrated with water and then
lyophilized.



Third round of library generation (round C):

The samples are dissolved in 175 µl 100 mM Na-borate pH 8.0 and distributed into 25
wells (7 µl/well). 2 µl 100 mM BBCx in water/DMSO and 1 µl of 250 mM DMT-MM is
added to each reaction and incubated at 30°C overnight. Water is added to 50 µl and
the reactions are then spin purified using Bio-Rad P6 DG (desalting gel) and
subsequently water is allowed to evaporate so that the final volume is 10 µl.

Addition of building block

The DNA material is then lyophilized and redissolved in 10 µl water containing 200
pmol oligo Cx (e.g. C1) and the corresponding oligo cx (e.g. c1). This is done so that the
codon in oligo Cx corresponds to the BBCx added to the DNA identifier. 10 units of T4
DNA ligase (Promega) and 1.2 µl T4 DNA ligase buffer is then added to each tube and
incubated at 20°C for 1 hour. The DNA identifier linked to the small molecules now
comprises an Ax oligo with a Bx oligo ligated to its 3' end and a Cx oligo ligated to the 3'
end of the Bx oligo. The reactions are then pooled, the pooled sample volume is
reduced by evaporation and the sample is purified by spinning through Biospin P-6
columns (Biorad) equilibrated with water. The pooled sample (~50 µl) is adjusted to 10
mM Na-acetate (pH 5). 0.25 volumes of 25 mM iodine in tetrahydrofuran/water (1:1) is
added and the sample is incubated at 37°C for 2 h. The reaction is then quenched by
addition of 2 µl of 1 M Na2S2O3 and incubation at RT for 5 min. Then the DNA
identifiers (carrying small molecules) are purified by spinning through Biospin P-6 columns
(Biorad) equilibrated with water and then lyophilized.

Final deprotection step

Some building blocks contain methyl esters that are deprotected to acids by dissolving
the pooled sample in 5 µl 20 mM NaOH, heating to 80°C for 10 minutes and adding 5
µl of 20 mM HCl.

Final extension step

To ensure that the DNA identifiers are double stranded prior to selection, oligo d is
extended along the identifier by adding to the sample 10 µl of 5x Sequenase EX-buffer
[100 mM Hepes, pH 7.5, 50 mM MgCl2, 760 mM NaCl] and 4000 pmol oligo d. Annealing
is performed by heating to 80°C and cooling to 20°C. To the sample is then added
500 µM dNTP, water to 50 µl and 39 units of Sequenase version 2.0 (USB). The
reaction is incubated at 37°C for 1 hour.

Selection

This library is subjected to selection, whereby binders to the selection target are
enriched.

Maxisorp ELISA wells (NUNC A/S, Denmark) were each coated with 100 µL 2 µg/mL
integrin αVβ3 in PBS buffer [2.8 mM NaH2PO4, 7.2 mM Na2HPO4, 0.15 M NaCl, pH 7.2]
overnight at 4°C. Then the integrin solution was substituted for 200 µl blocking buffer
[TBS, 0.05% Tween 20 (Sigma P-9416), 1% bovine serum albumin (Sigma A-7030), 1
mM MnCl2], which was left on for 3 hours at room temperature. Then the wells were
washed 10 times with blocking buffer and the encoded library was added to the wells
after diluting it 100 times with blocking buffer. Following 2 hours incubation at room
temperature the wells were washed 10 times with blocking buffer. After the final wash
the wells were cleared of wash buffer and subsequently inverted and exposed to UV
light at 300-350 nm for 30 seconds using a trans-illuminator set at 70% power. Then
100 µl blocking buffer without Tween-20 was immediately added to each well, the wells
were shaken for 30 seconds, and the solutions containing eluted identifiers were
removed for PCR amplification.

Cloning

A TOPO-TA (Invitrogen) ligation reaction is assembled with 4 µl PCR product, 1 µl salt
solution (Invitrogen) and 1 µl vector. Water is added to 6 µl. The reaction is then
incubated at RT for 30 min. Heat-shock competent TOP10 E. coli cells are then thawed on
ice and 5 µl of the ligation reaction is added to the thawed cells. The cells are then
incubated 30 min on ice, heat-shocked in 42°C water for 30 sec and then put on ice
again. 250 µl of growth medium is added to the cells and they are incubated 1 h at
37°C. The medium containing cells is then spread on a growth plate containing
… / ml ampicillin and incubated at 37°C for 16 hours.

Sequencing

Individual E. coli clones are then picked and transferred to PCR wells containing 50 µl
water. These 50 µl were incubated at 94°C for 5 minutes and 20 µl was used in a 25 µl
PCR reaction with 5 pmol of each TOPO primer, M13 forward and M13 reverse, and
Ready-To-Go PCR beads (Amersham Biosciences). The following PCR profile is used:
94°C 2 min, then 30 x (94°C 4 sec, 50°C 30 sec, 72°C 1 min), then 72°C 10 min.
Primers and nucleotides are then degraded by adding 1 µl 1:1 EXO/SAP mixture (USB
Corp.) to 2 µl PCR product and incubating at 37°C for 15 min and then 80°C for 15 min
to heat-inactivate the enzymes. 5 pmol T7 primer is added and water is added to 12 µl.
Then 8 µl DYEnamic ET cycle sequencing Terminator Mix (Applied Biosystems) is
added to each well. A thermocycling profile of 30 x (95°C 20 sec, 50°C 15 sec, 60°C 1
min) is then run. Then 10 µl water is added to each well and sequencing reactions are
purified using seq96 spinplates (Amersham Biosciences). Reactions are then run on a
MegaBace capillary electrophoresis instrument (Molecular Dynamics) using injection
parameters 2 kV, 50 sec and run parameters 9 kV, 45 min, and analyzed using Contig
Express software (Informax).

The chemical entities used in each position are shown below (structure drawings not
reproduced).

Position 1

Building Block
BB-A-000098
BB-A-000112
BB-A-000282
BB-A-000283

Position 2

[Building block table not legible in the source.]

Position 3

Building Block
BBA0000531
BBA0001006
BBA0001391
BBA0001401
BBA0008312
BBA0008512
BBA0008612

After the selection as described above, the codons in the identifier oligonucleotides
were analysed. Before the analysis, the identifier oligonucleotides were amplified using
the constant flanking regions and the amplified material was used in the identifier
sequence analysis.

A sequence codon analysis of the selected codons showed a bias for specific chemical
entities. They are listed in the table below. For instance, in position 1 chemical entity 98
was seen 47 times (out of 51 sequences, 92%, compared to 25% before the selection),
chemical entity 99 was seen 14 times (out of 51 sequences, 27%, compared to 4%
before selection) and chemical entity 53 was seen 35 times (out of 51 sequences, 68%,
compared to 14% before selection).

The chemical entities listed in the table below can then be used to generate a new and
more focused library.
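
The enrichment figures quoted above can be reproduced with a short calculation. The Python sketch below is an illustration only and assumes that, before selection, every chemical entity at a position was equally represented (1 of 4, 1 of 25 and 1 of 7, matching the 25%, 4% and 14% quoted in the text); the function name `enrichment` is hypothetical.

```python
TOTAL_SEQUENCES = 51

# entity: (times seen after selection, number of entities at that position)
observed = {"98": (47, 4), "99": (14, 25), "53": (35, 7)}

def enrichment(count, n_entities, total=TOTAL_SEQUENCES):
    """Return (fraction after selection, assumed fraction before, fold enrichment)."""
    after = count / total
    before = 1 / n_entities          # assumption: uniform representation before selection
    return after, before, after / before

results = {entity: enrichment(count, n) for entity, (count, n) in observed.items()}

for entity, (after, before, fold) in results.items():
    print(f"entity {entity}: {before:.0%} before, {after:.0%} after, {fold:.1f}-fold")
```

The fold enrichment puts the three entities on a common scale, which the raw percentages do not, because the starting frequency differs between positions.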



Oligo(-s)      Count   pos 1   pos 2   pos 3
BB-A-000282    4       282

[Remaining rows not legible in the source. Further building blocks listed: BBA0003172,
BBA0004212, BBA0004232, BBA000064, BBA0001011, BBA0003132, BBA0003142, BBA0003152.]

The new focused library with the selected chemical entities can be selected against
the target and the outcome from the selection can be analysed. The most abundant
binder will be the combination of the chemical entities 98-99-53, and the second
most abundant binder is 98-158-53, as shown below.



WO 2004/074429    PCT/DK2004/000117

Oligo(-s)                              Count   pos 1   pos 2   pos 3
BB-A-000098 BBA000099 BBA0000531       11      98      99      53
BB-A-000098 BBA0001582 BBA0000531      7       98      158     53
BB-A-000098 BBA0004242 BBA0000531      4       98      424     53
BB-A-000098 BBA0001582 BBA0001391      3       98      158     139
BB-A-000098 BBA0004182 BBA0000531      3       98      418     53
BB-A-000098 BBA000099 BBA0001391       2       98      99      139
BB-A-000098 BBA0001582 BBA0001006      2       98      158     100



This example demonstrates the possibility of reducing the library diversity by using the
enriched chemical entities in a new library and performing another round of selection on
the chosen chemical entities.
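
The step from selection output to a focused library can be sketched as a simple tally. The Python example below uses the oligo combinations and counts from the table above; the cut-off `MIN_COUNT` is a hypothetical choice made for illustration and is not a value given in this example.

```python
from collections import Counter
from math import prod

# (position 1, position 2, position 3, count), taken from the table of selected combinations.
rows = [
    ("BB-A-000098", "BBA000099", "BBA0000531", 11),
    ("BB-A-000098", "BBA0001582", "BBA0000531", 7),
    ("BB-A-000098", "BBA0004242", "BBA0000531", 4),
    ("BB-A-000098", "BBA0001582", "BBA0001391", 3),
    ("BB-A-000098", "BBA0004182", "BBA0000531", 3),
    ("BB-A-000098", "BBA000099", "BBA0001391", 2),
    ("BB-A-000098", "BBA0001582", "BBA0001006", 2),
]

# Tally how often each chemical entity occurs at each position.
per_position = [Counter(), Counter(), Counter()]
for *entities, count in rows:
    for position, entity in enumerate(entities):
        per_position[position][entity] += count

MIN_COUNT = 3  # hypothetical cut-off for carrying an entity into the next library
focused = [
    [entity for entity, count in counter.most_common() if count >= MIN_COUNT]
    for counter in per_position
]
focused_size = prod(len(entities) for entities in focused)

print(focused)
print(focused_size)
```

With this cut-off the focused library would contain one entity in position 1, four in position 2 and two in position 3; a different cut-off gives a different trade-off between diversity and enrichment.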

Example 7

The following experiment illustrates the principle of chemical entity (also termed
building block herein) evolution through multiple rounds of library generation and selection.
The experiment is not intended to limit the scope of the current invention.

Libraries were assembled by the combination of building blocks (BB), each of which
was encoded by an oligonucleotide (oligo). Some of the building blocks carried an
amine functional group and a carboxylic acid functional group. The building block
amine was protected by N-pentenoylation and deprotected by iodine treatment prior to
the reaction of the following building block. Oligonucleotide 1 (Oligo1) carried an amine
functional group to allow reaction with the carboxylic acid of building block 1, and
oligonucleotides are optionally derivatized by phosphorylation to allow ligation.
Oligonucleotides (oligos) also comprised a primer region for PCR amplification. EDC/NHS,
EDC/sulfoNHS or DMTMM was used as coupling reagent.

The following scheme describes the split and mix assembly of the libraries:

i.) n times [BB1 + Oligo1 → BB1-Oligo1] in separate wells

* Optionally purify product

ii.) mix all n wells into one tube

iii.) split product of ii.) into m separate wells

iv.) m times [BB2 + BB1-Oligo1 + Oligo2 → BB2-BB1-Oligo1-Oligo2] in separate wells

* Optionally purify product

v.) mix all m wells into one tube

vi.) split product of v.) into p separate wells

vii.) p times [BB3 + BB2-BB1-Oligo1-Oligo2 + Oligo3 → BB3-BB2-BB1-Oligo1-Oligo2-Oligo3]
in separate wells

* Optionally purify product

viii.) mix all p wells into one tube

ix.) Selection was performed and binders isolated

x.) PCR of DNA and sequencing

xi.) Analyse for building block abundance and full sequence information



Building block abundance analysis may be done by QPCR or by sequencing full
sequences and then analyzing for the abundance of individual building blocks.
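
The split and mix scheme and the subsequent abundance analysis can be illustrated with a small simulation. The Python sketch below is a toy model under stated assumptions: the building block names, the three-letter codons and the selection rule are all hypothetical, and chemistry, yields and purification are ignored.

```python
from collections import Counter

def split_and_mix(pool, building_blocks, codons):
    """One round: split the pool into one well per building block, react, then mix."""
    mixed = []
    for building_block, codon in zip(building_blocks, codons):   # one well per building block
        for molecule, identifier in pool:                         # each well gets part of the pool
            mixed.append((molecule + (building_block,), identifier + (codon,)))
    return mixed

# Hypothetical building blocks and codons for n = 3, m = 2 and p = 2 wells.
rounds = [
    (["BB1-a", "BB1-b", "BB1-c"], ["AAC", "AAG", "AAT"]),
    (["BB2-a", "BB2-b"], ["CCA", "CCG"]),
    (["BB3-a", "BB3-b"], ["GGA", "GGT"]),
]

library = [((), ())]
for building_blocks, codons in rounds:
    library = split_and_mix(library, building_blocks, codons)

# Hypothetical selection: pretend that only molecules containing BB1-a bind the target.
binders = [identifier for molecule, identifier in library if molecule[0] == "BB1-a"]

# Abundance of each codon at each position among the isolated identifiers.
abundance = [Counter(identifier[pos] for identifier in binders) for pos in range(3)]

print(len(library))
print(abundance)
```

Each identifier records one codon per round, so counting codons per position in the isolated identifiers recovers which building blocks were enriched without needing the structure of the binders themselves.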

The following types of building blocks were used, wherein R describes a group which is
varied for different building blocks:

Building block types used in position 1, 2 and 3

[Structures not reproduced.]

* Both (R) and (S) isomers were used for some R-groups

Building block types which were only used in position 3

[Structures not reproduced.]

The overall process leads to molecules of the following structure, where the
oligonucleotide was double stranded.

[Structure not reproduced. The building blocks in Position 3, Position 2 and Position 1
form the αvβ3 integrin receptor binding region, which is attached to the oligonucleotide
(PEG-spacer)-Oligo1-Oligo2-Oligo3-Primer.]

R' = H or R (as indicated for building blocks)
R'' = H or R (as indicated for building blocks)
R''' = H or R (as indicated for building blocks)
n = 1-2

The oligonucleotide was made double stranded by the use of double stranded Oligos
1, 2 and 3 with an overhang to allow ligation of both strands.



Summary of the experimental outcome:

Two libraries of 61,875 members (Library 1 and 2) were generated as described in
Example 6 above and selected separately for binders of the integrin αvβ3 receptor.
The libraries were generated with 99 different building blocks in position 1, 25 different
building blocks in position 2 and 25 different building blocks in position 3.
The identified sequences were then analyzed for the abundances of building blocks at
each position in the sequence. The most abundant building blocks at each position
from the two libraries 1 and 2 were then used again to generate a new and smaller
library of 1,365 members, which was selected for binders of the integrin αvβ3 receptor.
The library was generated with 7 different building blocks in position 1, 13 different
building blocks in position 2 and 15 different building blocks in position 3.

In the tables below, each of the building block numbers identifies one specific building
block or, in two instances (library 1), a mixture of three different building blocks. The
same numbers are used for each building block in all libraries; however, the
oligonucleotide used to identify each building block may not necessarily be the same between
libraries, to avoid potential problems of cross contamination.

The following tables describe the codon sequences and corresponding building blocks
used. The codon is only indicated for one of the strands.
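
Because only one strand of each codon is listed, the other strand can be derived by base pairing. The following Python sketch shows the reverse complement for two codons from the tables; it assumes plain Watson-Crick pairing over the listed sequence and does not model the overhangs used for ligation.

```python
# Watson-Crick pairing for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]

codons = ["TGTTC", "AGTACGAACGTGCATCAGAG"]
other_strand = {codon: reverse_complement(codon) for codon in codons}

for codon, partner in other_strand.items():
    print(codon, partner)
```

Applying the function twice returns the original sequence, which is a convenient sanity check when transcribing codon tables.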


Library 1. Position 1

Codon no.   Codon sequence ID   Building Block ID
1           TGTTC               BBA000092
2           CGAGC               BBA000354
3           GGATA               BBA000085
4           CGGTG               BBA000086
5           GTTAT               BBA000098
6           AGTGC               BBA000099
7           ACCTG               BBA000089
8           CTGGT               BBA000090
9           TAGGA               BBA000087
10          ACTCA               BBA000088
11          CTTAC               BBA000153
12          CGCAC               BBA000154
13          TCGGG               BBA000059
14          CGGAT               BBA000152
15          GAGAT               BBA000101
16          TGTAG               BBA000110
17          GTGTT               BBA000112
18          AGATG               BBA000113
19          ATCCT               BBA000114
20          TTGCT               BBA000286
21          ACGTA               BBA000123
22          ATCAC               BBA000124
23          TATCC               BBA000155
24          GGAAG               BBA000156
25          CGGTC               BBA000158
26          TGCTT               BBA000159
27          TTAGC               BBA000160
28          GCTGA               BBA000161
29          GAACG               BBA000162
30          CATGG               BBA000163
31          TGGTA               BBA000165
32          TCAAG               BBA000166
33          ATCGA               BBA000167
34          ATGCA               BBA000168
35          ACTAG               BBA000169
36          TACCT               BBA000170
37          TACGA               BBA000171
38          CTTCA               BBA000172
39          CTCTT               BBA000173
40          TCATC               BBA000174
41          ATTCC               BBA000175
42          CGACG               BBA000176
43          CCTGT               BBA000177
44          CCTTC               BBA000178
45          ACACC               BBA000179
46          TAACA               BBA000180
47          TAACA               BBA000098
48          CCAGG               BBA000181
49          ATGTC               BBA000182
50          GAGGA               BBA000183
51          GGTCA               BBA000184
52          GACTT               BBA000185
53          GGTGG               BBA000186
54          CAACT               BBA000190
55          ATGAG               BBA000195
56          TCTGC               BBA000196
57          ATAGG               BBA000197
58          CTACC               BBA000198
59          AAGTG               BBA000201
60          TCCAA               BBA000202
61          GCTCT               BBA000203
62          GGAGT               BBA000204
63          AATCG               BBA000205
64          AAGCT               BBA000206
65          CCGAA               BBA000207
66          TTTGT               BBA000208
67          CCGTG               BBA000209
68          TTTCG               BBA000210
69          TGAGG               BBA000211
70          GTTGC               BBA000212
71          AACTA               BBA000112
72          AACTA               BBA000280
73          CCTCG               BBA000281
74          AGCAA               BBA000282
75          TTCCA               BBA000313
76          AGACT               BBA000314
77          AGGTT               BBA000315
78          GCGTC               BBA000316
79          AACGT               BBA000317
80          CAAGA               BBA000287
81          AGAGA               BBA000419
82          GTACT               BBA000420
83          TAGAG               BBA000421
84          ACGAT               BBA000422
85          GACCA               BBA000200
86          TGGTT               BBA000194
87          GTCTC               BBA000427
88          CAGCA               BBA000428
89          TAGTC               BBA000199
90          GGGTG               BBA000187
91          CTCAG               BBA000191
92          AGAAC               BBA000284
93          GCGAG               BBA000458
94          GATGT               BBA000459
95          TCACT               BBA000461
96          CGTCT               OBA000610
97          AGCTC               OBA000611
98          CACTC               OBA000609
99          CAGTT               OBA000615



Library 1. Position 2

Codon no.   Codon sequence ID       Building Block ID
1           AGTACGAACGTGCATCAGAG    BBA000098
2           TAGTCTCCTCCACTTCCATG    BBA000099
3           TACATCGTTCCAGACTACCG    BBA000085
4           TCCAGTGCAAGACTGAACAG    BBA000153
5           AGCATCACTACTCTGTCTGG    BBA000206
6           TCTTGTCAACCTTCCATGCG    BBA000200
7           AAGGACGTTCCTAGTAGGTG    BBA000208
8           GGAACCATCAAGATCCTGAG    BBA000091
9           ATCTCTGACGAGATCCAAGG    BBA000090
10          TCAAGGTTGGTGGTGTACTG    BBA000092
11          TCGAACTTGTTGCTTCCTCG    BBA000123
12          CTGAGTGTGTAGTACCAACG    BBA000156
13          ATCTTGGTTGTTCTCCTGCG    BBA000163
14          TAGTAGCTTGGAGTAGACCG    BBA000197
15          TTCACTCCATGCAGCATGTG    BBA000083
16          ACGATGGTGATCGATCAACG    BBA000181
17          TTCAGTGCTTGAGCTACCTG    BBA000152
18          TTGGACTCTTCTTGCACCAG    BBA000088
19          TCAACCAACTGGTTCTTGGG    BBA000100
20          TAGTACTCTACACTGCTGCG    BBA00087/101/196
21          TACACCATGACTTGCAGACG    BBA00087/101/196
22          GCATCTTGAGTCGTTGAACG    BBA000059
23          GACTCATCTCACTGGAGTTG    BBA000124
24          TCCAGCTTCTAGGAAGACAG    BBA000160
25          CTTCTTGAGTGCACTAGCAG    BBA000201



Library 1. Position 3

Codon no.   Codon sequence ID           Building Block ID
1           [not legible]               [not legible]
2           [not legible]               [not legible]
3           [not legible]               [not legible]
4           [not legible]               [not legible]
5           [not legible]               [not legible]
6           [not legible]               [not legible]
7           [not legible]               [not legible]
8           [not legible]               [not legible]
9           CCACGAGGTCTCCACTGGTCCAGG    BBA000090
10          CCACTGAGCTGCTCCTCCAGGTGG    BBA000092
11          CCTCCTGTCCTGCACGTCCATCCG    BBA000123
12          CAGCACCTGGAGGTAGGACCACGG    BBA000156
13          CGACCAGACGAGGACCAGGTAGGC    BBA000163
14          CCAGGTTCGAGGACCTCGTCAGGC    BBA000197
15          CGAGCACGAGGAGGACGTGTCGAG    BBA000100
16          CCACGTCCACAGGTGCACCAGGTG    BBA000181
17          CCTGGTGCTCCACGACGTGCTTCG    BBA000152
18          CACGTGACGACCTGGTCAGGTGGG    BBA000088
19          GGTAGCTCGTGCTGGTCCTCCTGG    BBA000101
20          CGACGACCACCACCTTGGACACCC    BBA000196
21          CCTACGTCGTGCTCACGTCCTGGC    BBA00087
22          CGACGACAGCTAGGAGGAGGTGGG    BBA000083
23          CTGGTGGAGCTGCACGAGCACAGG    BBA000059
24          CAGGACTGGACGACGACCAGGTCG    BBA000124
25          CGATGCTGCAGAGGACCAGCACCC    BBA000160

Library 2. Position 1

Codon no.   Codon sequence ID   Building Block ID
1           TGTTC               BBA000092
2           CGAGC               BBA000354
3           GGATA               BBA000085
4           CGCTG               BBA000086
5           GTTAT               BBA000098
6           AGTGC               BBA000099
7           ACCTG               BBA000089
8           CTGGT               BBA000090
9           TAGGA               BBA000087
10          ACTCA               BBA000088
11          CTTAC               BBA000153
12          CGCAC               BBA000154
13          TCGCG               BBA000059
14          CGGAT               BBA000152
15          GAGAT               BBA000101
16          TGTAG               BBA000110
17          GTGTT               BBA000112
18          AGATG               BBA000113
19          ATCCT               BBA000114
20          TTGCT               BBA000286
21          ACGTA               BBA000123
22          ATCAC               BBA000124
23          TATCC               BBA000155
24          GGAAG               BBA000156
25          CGGTC               BBA000158
26          TGCTT               BBA000159
27          TTAGC               BBA000160
28          GCTGA               BBA000161
29          GAACG               BBA000162
30          CATGG               BBA000163
31          TGGTA               BBA000165
32          TCAAG               BBA000166
33          ATCGA               BBA000167
34          ATGCA               BBA000168
35          ACTAG               BBA000169
36          TACCT               BBA000170
37          TACGA               BBA000171
38          CTTCA               BBA000172
39          CTCTT               BBA000173
40          TCATC               BBA000174
41          ATTCC               BBA000175
42          CGACG               BBA000176
43          CCTGT               BBA000177
44          CCTTC               BBA000178
45          ACACC               BBA000179
46          TAACA               BBA000180
47          TAACA               BBA000098
48          CCAGG               BBA000181
49          ATGTC               BBA000182
50          GAGGA               BBA000183
51          GGTCA               BBA000184
52          GACTT               BBA000185
53          GGTGG               BBA000186
54          CAACT               BBA000190
55          ATGAG               BBA000195
56          TCTGC               BBA000196
57          ATAGG               BBA000197
58          CTACC               BBA000198
59          AAGTG               BBA000201
60          TCCAA               BBA000202
61          GGTCT               BBA000203
62          GGAGT               BBA000204
63          AATCG               BBA000205
64          AAGCT               BBA000206
65          CCGAA               BBA000207
66          TTTGT               BBA000208
67          CCGTG               BBA000209
68          TTTCG               BBA000210
69          TGAGG               BBA000211
70          GTTGC               BBA000212
71          AACTA               BBA000112
72          AACTA               BBA000280
73          CCTCG               BBA000281
74          AGCAA               BBA000282
75          TTCCA               BBA000313
76          AGACT               BBA000314
77          AGGTT               BBA000315
78          GCGTC               BBA000316
79          AACGT               BBA000317
80          CAAGA               BBA000287
81          AGAGA               BBA000419
82          GTACT               BBA000420
83          TAGAG               BBA000421
84          ACGAT               BBA000422
85          GACCA               BBA000200
86          TCGTT               BBA000194
87          GTCTC               BBA000427
88          CAGCA               BBA000428
89          TAGTC               BBA000199
90          GGGTG               BBA000187
91          CTCAG               BBA000191
92          AGAAC               BBA000284
93          GGGAG               BBA000458
94          GATGT               BBA000459
95          TCACT               BBA000461
96          CGTCT               OBA000610
97          AGCTC               OBA000611
98          CACTC               OBA000609
99          CAGTT               OBA000615



Library 2. Position 2

Codon no.   Codon sequence ID       Building Block ID
1           AGTACGAACGTGCATCAGAG    BBA000059
2           TAGTCTCCTCCACTTCCATG    BBA000085
3           TACATCGTTCCAGACTACCG    BBA000098
4           TCCAGTGCAAGACTGAACAG    BBA000099
5           AGCATCACTACTCTGTCTGG    BBA000101
6           TCTTGTCAACCTTCCATGCG    BBA000110
7           AAGGACGTTCCTAGTAGGTG    BBA000113
8           GGAACCATCAAGATCCTGAG    BBA000114
9           ATCTCTGACGAGATCCAAGG    BBA000123
10          TCAAGGTTGGTGGTGTACTG    BBA000124
11          TCGAACTTGTTGCTTCCTCG    BBA000152
12          CTGAGTGTGTAGTACCAACG    BBA000158
13          ATCTTGGTTGTTCTCCTGCG    BBA000160
14          TAGTAGCTTGGAGTAGACCG    BBA000161
15          TTCACTCCATGCAGCATGTG    BBA000167
16          ACGATGGTGATCGATCAACG    BBA000176
17          TTCAGTGCTTGAGCTACCTG    BBA000181
18          TTGGAGTCTTCTTGCACCAG    BBA000313
19          TCAACCAACTGGTTCTTGGG    BBA000314
20          TAGTACTCTACACTGCTGCG    BBA000315
21          TACACCATGACTTGCAGACG    BBA000316
22          GCATCTTGAGTCGTTGAACG    BBA000317
23          GACTCATCTCACTGGAGTTG    BBA000420
24          TCCAGGTTCTAGGAAGACAG    BBA000421
25          CTTCTTGAGTGCACTAGCAG    BBA000422



Library 2. Position 3

Codon no.   Codon sequence ID           Building Block ID
1           CGAGCAGGACCTGGAACCTGGTGC    BBA000052
2           CTCGACCACTGCAGGTGGAGCTCC    BBA000053
3           CGTGCTTCGTCTGCTGCACCACCG    BBA000064
4           CCTGGTGTCGAGGTGAGCAGCAGC    BBA000056
5           CTCGACGAGGTCCATCCTGGTCGC    BBA000067
6           CGTGAGGAGCAGGTCCTCCTGTGG    BBAOOGOSS
7           CCTGACACTGGTGGTGGTCGAGGC    BBA000062
8           CCATCTCGACGACCTGCTCCTGGG    BBA000139
9           CCACGAGGTCTCCACTGGTCCAGG    BBA000140
10          CCACTGAGCTGCTCCTCCAGGTGG    BBA000100
11          CCTCCTGTCCTGCACGTCCATCCG    BBA000059
12          CAGCACCTGGAGGTAGGACCACGG    BBA000085
13          CGACCAGACGAGGACCAGGTAGGC    BBA000098
14          CCAGGTTCGAGGACCTCGTCAGCC    BBA000099
15          CGAGCACGAGGAGCACGTGTCCAG    BBA000101
16          CCACGTCCACAGGTGCACCAGGTG    BBA000110
17          CCTGGTGGTCCACGACGTGCTTCG    BBA000113
18          CACGTGACGACCTGGTCAGGTGGG    BBA000114
19          CGTAGCTCGTGCTGGTCCTCCTGG    BBA000123
20          CGACGAGCACCACCTTGGACACCC    BBA000124
21          CCTACGTCGTGCTCACGTCCTGCC    BBA000152
22          CGACGACAGCTAGGAGGAGGTGGG    BBA000158
23          CTGGTGGAGCTGCACGAGCACAGC    BBA000160
24          CAGGACTGGACGACGACCAGGTCG    BBA000161
25          CGATGCTGCAGACGACCAGCACCG    BBA000167



Library 3. Position 1

Codon no.   Codon sequence ID   Building Block ID   More abundant in position 1 in library no.
1           TGTTC               BBA000092           1
2           AGTGA               BBA000088           1
3           CTTAC               BBA000153           1 and 2
4           CGGAT               BBA000152           1
5           ATTGG               BBA000175           1 and 2
6           GTGTG               BBA000427           1
7           ACAGT               BBA000098           1 and 2



Library 3. Position 2

Codon no.   Codon sequence ID            Building Block ID   More abundant in position 2 in library no.
1           GCACAAGTACGAACGTGCATCAGAG    BBA000059           1
2           GCACATAGTCTCCTCCACTTCCATG    BBA000083           1
3           GCACATACATCGTTCCAGACTACCG    BBA000086           2
4           GCACATCCAGTGCAAGACTGAACAG    BBA000088           1
5           GCACAAGCATCACTACTCTGTCTGG    BBA000090           1
6           GCACATCTTGTCAACCTTCCATGCG    BBA000099           1 and 2
7           GCACAAAGGACGTTCCTAGTAGGTG    BBA000110
8           GCACAGGAACCATCAAGATCCTGAG    BBA000114           2
9           GCACAATCTCTGACGAGATCCAAGG    BBA000152           2
10          GCACATCAAGGTTGGTGGTGTACTG    BBA000160           2
11          GCACATCGAACTTGTTGCTTCCTCG    BBA000200           1
12          GCACACTGAGTGTGTAGTACCAACG    BBA000201           1
13          GCACAATCTTGGTTGTTCTCGTGGG    BBA000422           2



Library 3, Position 3

Codon no.   Codon sequence ID                                    Building Block ID   More abundant in position 3 in library no.
1           GAGGACGAGCAGGACCTGGAACCTGGTGCGTTCCTCCACCACGTCTCCG    BBA000053           2
2           GAGGACTCGACCACTGCAGGTGGAGCTCCGTTCCTCCACCACGTCTCCG    BBA000085           1
3           GAGGACGTGCTTCCTCTGCTGCACCACCGGTTCCTCCACCACGTCTCCG    BBA000087           1
4           GAGGACCTGGTGTCGAGGTGAGCAGCAGCGTTCCTCCACCACGTCTCCG    BBA000090           1
5           GAGGACTCGACGAGGTCCATCCTGGTCGCGTTCCTCCACCACGTCTCCG    BBA000091           1
6           GAGGACGTGAGGAGCAGGTCCTCCTGTCGGTTCCTCCACCACGTCTCCG    BBA000098           1
7           GAGGACCTGACACTGGTCGTGGTCGAGGCGTTCCTCCACCACGTCTCCG    BBA000100           1 and 2
8           GAGGACCATCTCGACGACCTGCTCCTGGGGTTCCTCCACCACGTCTCCG    BBA000139           2
9           GAGGACCACGAGGTCTCCACTGGTCCAGGGTTCCTCCACCACGTCTCCG    BBA000140           2
10          GAGGACCACTGAGCTGCTCCTCCAGGTGGGTTCCTCCACCACGTCTCCG    BBA000152
11          GAGGACCTCCTGTCCTGCACGTCCATCCGGTTCCTCCACCACGTCTCCG    BBA000153           1
12          GAGGACAGCACCTGGAGGTAGGACCACGGGTTCCTCCACCACGTCTCCG    BBA000161
13          GAGGACGACCAGACGAGGACCAGGTAGGCGTTCCTCCACCACGTCTCCG    BBA000167           2
14          GAGGACCAGGTTCGAGGACCTCGTCAGCCGTTCCTCCACCACGTCTCCG    BBA000197           1
15          GAGGACGAGCACGAGGAGCACGTGTCCAGGTTCCTCCACCACGTCTCCG    BBA000200
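The three Library 3 tables list 7, 13 and 15 building blocks for positions 1, 2 and 3. As a rough illustration of what that implies for library size, the sketch below enumerates every one-per-position combination. It assumes the library is fully combinatorial (the tables do not state this explicitly), and the helper names `bb`, `library_size` and `enumerate_library` are hypothetical.

```python
from itertools import product
from math import prod


def bb(number: int) -> str:
    """Format a building-block ID in the BBA000nnn style used in the tables."""
    return f"BBA{number:06d}"


# Building-block numbers transcribed from the Library 3 tables.
# OCR-damaged IDs were normalised to the BBA000nnn pattern (an assumption).
POSITION_1 = [bb(n) for n in (92, 88, 153, 152, 175, 427, 98)]
POSITION_2 = [bb(n) for n in (59, 83, 86, 88, 90, 99, 110, 114, 152, 160, 200, 201, 422)]
POSITION_3 = [bb(n) for n in (53, 85, 87, 90, 91, 98, 100, 139, 140, 152, 153, 161, 167, 197, 200)]


def library_size(*positions):
    """Upper bound on library diversity: product of the per-position counts."""
    return prod(len(p) for p in positions)


def enumerate_library(*positions):
    """Yield every (position 1, position 2, position 3) combination."""
    return product(*positions)


if __name__ == "__main__":
    print(library_size(POSITION_1, POSITION_2, POSITION_3))  # 7 * 13 * 15
```

Under the full-combination assumption this gives 7 × 13 × 15 = 1365 possible encoded molecules, which is the focused diversity the later selection draws on.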



A subset of the isolated sequences from the library post selection was analysed: 



^CAGCACACTCCTCGCACATACATCGTTCCAGACTACCCyVGGACC^^ 
(2) G6CAGCACAGT 

C«nCGCTACATCCTT(nt:AACCrnCCATGCGAGTACCTTACACTGG^ 

^Ga»QAT423COTCGCACATmGTX»ACCTTC(»TGCGAfi^^ 

CT . 

TCCTC 
(5) 

GGCA«»CA6rcGTCGCACATCAnGTA<»AACCTTCOKTGCGAGGACCAT^ 

CTC 
(6) 

GGCAGCACACTaSTCGCACATCTTGTCAACCTTTCATGC^ 

(n 

GGCAGCACAGTCGTO 



iGCACATCTTGTCMCCrrCCATGCGAGGACCATCTCGACGACCTGCTCCTGGGG^ 
(8) 

GGCAGCACAGTCGTCGCACATXnTGTCMCCTTCCATGCGAGGACC^^ 
(9) 

GGCAGCACAGTCGTCGCACATCTTGTCMCmCCATGCGAGGACCA^ 

(iO) 

GGCAGCACAC?T(XSTCGCACATm6TCAA(XnCCATGCGA^ 

22:AGCACrAGATCGTCGCACyVTCTTGTCAACC^^ 
TTCCTC 

(12) GGCAGCACAGAT 

CGTCGCACATOTGTCAAanTCCATGCGAGGACeATCTC^ 

(13) ^ ^r. 

GGCAGCACy^GTXX3TCGCACATCTrGTCAACCnCCATGCGA<3G^ 
CTC 

j^^LvGCACAGTCGTCGCACATOT 
(15) 

GGCAGCySCAGTCGTCGCACATCnGTCAACOT 
(16) 

GGCAGCCGGATCGTCGCACATCTTGTCMCCTTOVVT^ 
^^CAGCCGGATCGTCGCACATCncni^ 

(18) . 

GGCAGCCGGATCGTCGCACATCTTGTCMCCTTCCATGOBAG^ 

(19) GGCAGCACAGTCGTCGCMTCCAGTCMGACTGAACAGAGGACCATC^
(20)

GGCAGCACAGTCGTCGCACATCrrrGTCAACCTmCGATGCGAGGACGA 
TC ^ 

(21) GGCAGCACAGTCGTCGCACATCTTGTCACXTTCCATQfcGAGGACGAGCAGGACCTGi^
(22) 

GGCAGCACAGTCGTCGCACATCTTGTCWVCeTTCCATGCGAGGACGATGCAGGAC^^ 
C 

(23) GGCAGCXXSGATCGTCGCACATXnTGGTNMNCTTCCATGCGA^
C 

<24)GGCGGATCGTCGCACATCTTGTCMCCrrTCCATGCGAGGACCACGAGGTCTCX^ 
(25) 

GGCAGCACAGTCGTCGGCACATCnrTGGTCAACCTTCCVKTGaSAGGACGACGAGCT 
CTC
(26) 

GGCAGCCGGATCGTCGCACATCTTGTCAACCTTCCATGaSAGGACGACCAAGACGA 
(27) 

GGCAGCCGGAT423CGTCGCACATCTTGTCAACCTTCCATGCGAGGACGTGATGGAGCAAGTCCTC^^ 
CTC
(28) 

GGCAGCACAGTCGTCGCACATCTTGTCAACCTTCCATGCGAGGA^ 

(29) 

GCCCAAACMGTCGTCGCACATCnTGTCAACCrnCCATGC^ 
TCCT
(30) 

GGAGCACAGATCGTCGCACATGCTTGTCAAGCCTTrCCATCGCGAGGAGCATCCTA 

CTGGGGTrC 

(31) 

GGCAGCCGGATGGTCGCACATCAATGGTnGGCTGGTreATACTGA
TTCCTC 



These sequences could be translated into the following building block compositions:

Sequence no.   Position 1   Position 2   Position 3
1              BBA000098    BBA000086    BBA000100
2              BBA000098    BBA000099    BBA000100
3              BBA000152    BBA000099    BBA000100
4              BBA000153    BBA000152    BBA000100
5              BBA000098    BBA000099    BBA000139
6              BBA000098    BBA000099    BBA000139
7              BBA000098    BBA000099    BBA000139
8              BBA000098    BBA000099    BBA000139
9              BBA000098    BBA000099    BBA000139
10             BBA000098    BBA000099    BBA000139
11             BBA000098    BBA000099    BBA000139
12             BBA000098    BBA000099    BBA000139
13             BBA000098    BBA000099    BBA000139
14             BBA000098    BBA000099    BBA000139
15             BBA000098    BBA000099    BBA000139
16             BBA000152    BBA000099    BBA000139
17             BBA000152    BBA000099    BBA000139
18             BBA000152    BBA000099    BBA000139
19             BBA000098    BBA000088    BBA000139
20             BBA000098    BBA000099    BBA000053
21             BBA000098    BBA000099    BBA000053
22             BBA000098    BBA000099    BBA000053
23             BBA000152    BBA000099    BBA000053
24             BBA000152    BBA000099    BBA000140
25             BBA000098    BBA000099    BBA000140
26             BBA000152    BBA000099    BBA000167
27             BBA000152    BBA000099    BBA000098
28             BBA000098    BBA000099    BBA000200
29             BBA000098    BBA000099
30             BBA000098    BBA000099
31             BBA000152    BBA000160
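The translation from an isolated identifier sequence to its building blocks is a lookup against the Library 3 codon tables. The following is a minimal sketch of that lookup, not the procedure actually used in the example. It assumes each read starts with a `GGCAGC` prefix followed directly by the 5-nucleotide position 1 codon (a layout inferred from the legible reads), matches position 2 on the full table entry, and matches position 3 on the 24-nucleotide core of the table entry with the flanking sequences dropped. Only a few codons are included, and `decode` and the example read are hypothetical.

```python
# Assumed constant prefix preceding the position 1 codon in each read.
PREFIX = "GGCAGC"

# Position 1: 5-nucleotide codons (Library 3, Position 1 table).
POS1 = {"ACAGT": "BBA000098", "CGGAT": "BBA000152", "CTTAC": "BBA000153"}

# Position 2: subset of the Library 3, Position 2 entries.
POS2 = {
    "GCACATCTTGTCAACCTTCCATGCG": "BBA000099",
    "GCACATACATCGTTCCAGACTACCG": "BBA000086",
    "GCACATCCAGTGCAAGACTGAACAG": "BBA000088",
}

# Position 3: 24-nucleotide cores of a subset of the Library 3, Position 3 entries.
POS3 = {
    "CCATCTCGACGACCTGCTCCTGGG": "BBA000139",
    "CCTGACACTGGTCGTGGTCGAGGC": "BBA000100",
    "CGAGCAGGACCTGGAACCTGGTGC": "BBA000053",
    "CCACGAGGTCTCCACTGGTCCAGG": "BBA000140",
}


def decode(read):
    """Return (position 1, position 2, position 3) building-block IDs.

    Unrecognised codons decode to None; a read without the prefix returns None.
    """
    start = read.find(PREFIX)
    if start < 0:
        return None
    codon1 = read[start + len(PREFIX): start + len(PREFIX) + 5]
    pos1 = POS1.get(codon1)
    pos2 = next((bb for codon, bb in POS2.items() if codon in read), None)
    pos3 = next((bb for codon, bb in POS3.items() if codon in read), None)
    return pos1, pos2, pos3


# Illustrative read assembled from table entries; the linker pieces are assumptions.
EXAMPLE_READ = (
    "GGCAGC" + "ACAGT" + "CGTC"
    + "GCACATCTTGTCAACCTTCCATGCG"
    + "AGGA" + "CCATCTCGACGACCTGCTCCTGGG"
    + "GTTCCTC"
)

if __name__ == "__main__":
    print(decode(EXAMPLE_READ))
```

Exact substring matching is used for simplicity; real reads with sequencing errors would need a tolerant matcher, which is outside this sketch.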





In position 1, L-Asp (BBA000098) dominated. D-Asp (BBA000152) was also found.
In position 2, Gly (BBA000099) dominated.

In position 3, building blocks carrying an amidine and no amine functionality were found
to dominate:




BBA000100 BBA000140

The most abundant sequence was thereby found to correspond to the following structure:

BBA000139-BBA000099-BBA000098

The following 3 sequences

BBA000098-BBA000099-BBA000139
BBA000098-BBA000099-BBA000100
BBA000098-BBA000099-BBA000053

out of the 31 identified sequences were selected for further analysis using a standard
ELISA assay and thereby verified as binders of the αvβ3 integrin receptor.
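The per-position dominance reported above can be recomputed by tallying the 31 decoded sequences. The sketch below does this with the translation table grouped into identical compositions, using the last three digits of each ID (so `"098"` stands for BBA000098). The OCR-damaged IDs were normalised before transcription, which is an assumption, and blank position 3 cells are recorded as `None`. The `above_threshold` helper illustrates a cutoff-based choice of building blocks for a further library; its cutoff value is arbitrary and not taken from the source.

```python
from collections import Counter

# (number of sequences, position 1, position 2, position 3)
GROUPS = [
    (1, "098", "086", "100"),
    (1, "098", "099", "100"),
    (1, "152", "099", "100"),
    (1, "153", "152", "100"),
    (11, "098", "099", "139"),
    (3, "152", "099", "139"),
    (1, "098", "088", "139"),
    (3, "098", "099", "053"),
    (1, "152", "099", "053"),
    (1, "152", "099", "140"),
    (1, "098", "099", "140"),
    (1, "152", "099", "167"),
    (1, "152", "099", "098"),
    (1, "098", "099", "200"),
    (2, "098", "099", None),
    (1, "152", "160", None),
]


def position_counts(groups):
    """Count how often each building block occurs at each of the three positions."""
    counts = [Counter(), Counter(), Counter()]
    for n, *blocks in groups:
        for counter, block in zip(counts, blocks):
            if block is not None:
                counter[block] += n
    return counts


def above_threshold(counter, minimum):
    """Building blocks seen at least `minimum` times (hypothetical cutoff)."""
    return sorted(block for block, n in counter.items() if n >= minimum)


if __name__ == "__main__":
    for position, counter in enumerate(position_counts(GROUPS), start=1):
        print(position, counter.most_common(2))
```

With these rows, BBA000098 accounts for 21 of 31 sequences in position 1, BBA000099 for 27 of 31 in position 2, and BBA000139 for 15 of the 28 sequences with a readable position 3.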

While the invention has been described with references to specific methods and
embodiments, it will be appreciated that various modifications and changes may be made
without departing from the invention. All patent and literature references cited herein
are hereby incorporated by reference in their entirety.



Claims

1. A method for producing a composition of molecules with an improved desired
property, comprising the steps of:

i) providing an initial library comprising a plurality of different encoded
molecules associated with a corresponding identifier nucleic acid sequence,
wherein each encoded molecule comprises a reaction product of multiple
chemical entities and the identifier nucleic acid sequence comprises codons
identifying said chemical entities,

ii) subjecting the initial library to a condition partitioning members having
encoded molecules displaying a predetermined property from the remainder of
the initial library,

iii) identifying codons of the identifier nucleic acid sequences of the
partitioned members of the initial library, and

iv) preparing a second-generation library of encoded molecules using the
chemical entities coded for by the codons of the partitioned members of the initial
library or a part thereof.

2. The method according to claim 1, wherein the second-generation library
comprises a plurality of different encoded molecules associated with a corresponding
identifier nucleic acid sequence, wherein each encoded molecule comprises a reaction
product of multiple chemical entities and the identifier nucleic acid sequence comprises
codons identifying said chemical entities.

3. The method of claim 1 or 2, further comprising subjecting the second
generation library to a condition partitioning members having encoded molecules
displaying a predetermined property from the remainder of the second generation
library.

4. The method according to any of the claims 1 to 3, further comprising the step
of deducing the identity of the encoded molecule(s) using the identifier nucleic acid
sequence.

5. The method according to claim 4, wherein the codons of the identifier nucleic
acid sequence are decoded to establish the synthesis history of the encoded molecules.

6. The method according to any of the claims 1 to 5, wherein the encoded
molecule associated with the corresponding identifier nucleic acid sequence is a
bifunctional complex.

7. The method according to any of the claims 1 to 6, wherein the encoded
molecule is covalently associated with the corresponding identifier nucleic acid
sequence.

8. The method according to any of the claims 1 to 7, wherein the multiple
chemical entities are precursors for a structural unit appearing in the encoded
molecule.

9. The method according to any of the claims 1 to 8, wherein the chemical
entities are reacted without enzymatic interaction to produce the encoded molecule.

10. The method according to any of the claims 1 to 9, wherein some or all
chemical entities are not naturally occurring α-amino acids or precursors thereof.

11. The method according to any of the claims 1 to 10, wherein the encoded
molecule is not an α-polypeptide.

12. The method according to any of the claims 1 to 11, wherein each codon
comprises 4 or more nucleotides.

13. The method according to any of the claims 1 to 12, wherein the codons are
separated by a framing sequence.

14. The method according to claim 13, wherein the framing sequence positions
the reaction of a chemical entity in the synthesis history of the encoded molecule.

15. The method according to any of the claims 1 to 14, wherein the identifier
nucleic acid sequence comprises two or more codons.

16. The method according to any of the claims 1 to 15, wherein the identifier
nucleic acid sequence comprises three or more codons.

17. The method according to any of the claims 1 to 16, wherein the identifier
nucleic acid sequence is amplifiable and comprises codons identifying chemical
entities, which have participated in the formation of the encoded molecule.

18. The method according to any of the claims 1 to 17, wherein in step ii) the
condition for partitioning of the desired library members includes subjecting the initial
library to a molecular target and partitioning members binding to said target.

19. The method according to any of the claims 1 to 18, wherein the encoded
molecule has a molecular weight less than 2000 Dalton, preferably less than 1000
Dalton, and more preferred less than 500 Dalton.

20. The method according to any of the claims 1 to 19, wherein the identifier
nucleic acid sequence identifies the encoded molecule uniquely.

21. The method according to any of the claims 1 to 20, wherein the identifier
nucleic acid sequence is detached from the encoded molecule.

22. The method according to any of the claims 1 to 21, wherein the identifier nucleic
acid sequence prior to step iii) is amplified.

23. The method of claim 22, wherein the identifier nucleic acid sequence is
amplified applying the polymerase chain reaction (PCR).

24. The method according to any of the claims 1 to 23, wherein the codons of the
identifier nucleic acid sequences of the partitioned members of the initial library are
identified by contacting said identifier nucleic acid sequences with a pool of nucleic acid
fragments under conditions allowing for hybridisation.

25. The method according to claim 24, wherein the pool of nucleic acid fragments
comprises a plurality of single stranded nucleic acid probes immobilized in discrete
areas of a solid support, wherein the nucleic acid probes are capable of hybridising to a
codon of the identifier nucleic acid sequence comprising codons.

26. The method of claim 25, wherein the identity of the codons is revealed by
observing the discrete areas of the support in which a hybridisation event has occurred.

27. The method according to any of the claims 24 to 26, wherein the nucleic acid
probe of the array is hybridised to an identifier nucleic acid sequence through an
adapter oligonucleotide having a sequence complementing the probe as well as one or
more codons of the identifier nucleic acid sequence.

28. The method according to any of the claims 24 to 27, wherein a probe of the
array is capable of hybridising to two codons of the identifier nucleic acid sequence or a
sequence complementary to said sequence.

29. The method according to claims 24 to 28, wherein a nucleic acid probe of the
array is capable of hybridising to all codons of an identifier nucleic acid sequence.

30. The method according to any of the claims 24 to 29, wherein a nucleic acid
probe is capable of hybridising to all but one codon of the identifier, or less.

31. The method according to any of the preceding claims, wherein the existence
of a hybridisation event is measured through labelling of the identifier nucleic acid
sequence, or an amplification product thereof.

32. The method according to any of the claims 24 to 31, wherein the hybridisation
event is measured by the emission of light in a scanner.

33. The method according to claim 31 or 32, wherein the relative intensity of light
in each discrete spot is measured.

34. The method according to claim 24, wherein nucleic acid fragments are primer
oligonucleotides, and the identification involves subjecting the hybridisation complex
between the primer oligonucleotides and the identifier nucleic acid sequences to a
condition allowing for an extension reaction to occur when the primer is sufficiently
complementary to a part of the identifier nucleic acid sequence, and evaluating, based
on measurement of the extension reaction, the presence, absence, or relative
abundance of one or more codons.

35. The method according to claim 34, wherein the condition, which allows for an
extension reaction to occur, includes a polymerase or a ligase as well as suitable
substrates.

36. The method according to claim 35, wherein the condition includes a
polymerase and a substrate comprising a blend of (deoxy)ribonucleotide triphosphates.

37. The method according to any of the claims 34 to 36, wherein at least a part of
the primer oligonucleotide is complementary to a codon.

38. The method according to claims 34 to 37, wherein at least a part of the primer
oligonucleotide is complementary to a codon and an adjacent framing sequence.

39. The method according to any of the claims 34 to 38, wherein the sequence
comprising the codon and an adjacent framing sequence has a total length of 11
nucleotides or more.

40. The method according to any of the claims 34 to 39, wherein the extension
reaction is measured using the polymerase chain reaction (PCR), wherein the primer of
claim 34 is involved in said PCR.

41. The method according to any of the claims 34 to 40, wherein a primer is
labelled.

42. The method according to claim 41, wherein the primer is labelled with a small
molecule, a radioactive component, or a fluorogenic molecule.

43. The method according to claim 42, wherein the small molecule label is
selected from biotin, dinitrophenol, and digoxigenin, and the PCR amplicons are
detected using an enzyme labelled streptavidin, anti-dinitrophenol, or anti-digoxigenin,
respectively, reporter molecule.

44. The method according to any of the claims 34 to 43, wherein the extension
reaction is measured by real-time PCR.

45. The method according to claim 44, wherein the real-time PCR involves the
use of an oligonucleotide probe responsible for the generation of a detectable signal
during the propagation of the PCR reaction.

46. The method according to any of the claims 34 to 45, wherein the probe is
designed to hybridise at a position downstream of a primer binding site.

47. The method according to claim 45 or 46, wherein the probe is a 5' nuclease
oligoprobe or a hairpin oligoprobe.

48. The method according to claim 24, wherein the nucleic acid fragment is
associated with a chemical entity precursor capable of being transferred to a recipient
reactive group.

49. The method according to claim 48, wherein the pool of nucleic acid fragments
comprises an anti-codon identifying the chemical entity, said anti-codon complementing
a codon of one or more identifier nucleic acid sequences.

50. The method according to claim 48 or 49, wherein the pool of nucleic acid
fragments further comprises anti-codons not complemented by codons on an identifier
nucleic acid sequence.

51. The method according to any of the claims 48 to 50, wherein the nucleic acid
fragments, each comprising an anti-codon and a chemical entity, hybridised to the
identifier nucleic acid sequences comprising codons are recovered.

52. The method according to claim 51, further comprising formation of a second
generation library of complexes, each member of the library comprising an encoded
molecule and an identifier nucleic acid sequence, which codes therefor, using the
recovered nucleic acid fragments as building blocks.

53. The method according to any of claims 48 to 52, wherein the identifier nucleic
acid sequences of the complexes are recovered from the partitioned complexes of step
ii) in claim 1.

54. The method according to claim 1, wherein the identifier nucleic acid
sequences of the partitioned library members are amplified prior to the identification
step.

55. The method according to claim 54, wherein the amplification is conducted
using the polymerase chain reaction (PCR).

56. The method according to any of claims 1 to 55, wherein the identifier nucleic
acid sequences comprising codons are immobilized during step iii).

57. The method according to claim 56, wherein, in step iii), the identifier nucleic acid
sequences are immobilized on a solid phase and the pool of nucleic acid fragments is
present in a mobile phase.

58. The method according to any of claims 1 to 57, wherein the conditions used
during the contacting step allow for specific hybridisation between nucleic acid
fragments and the identifier nucleic acid sequence comprising codons.

59. The method according to any of the claims 51 to 58, wherein the nucleic acid
fragments are recovered using denaturing conditions.

60. The method according to any of the claims 1 to 59, wherein the second
generation library is formed by

a) mixing under hybridisation conditions, nascent bifunctional complexes
comprising a chemical entity or a reaction product of chemical entities, and an
identifier nucleic acid sequence comprising codon(s) identifying said chemical
entities, with the recovered nucleic acid fragments, said fragments comprising an
oligonucleotide sufficiently complementary to at least a part of the identifier nucleic
acid sequence to allow for hybridisation, a transferable chemical entity and an
anticodon identifying the chemical entity, to form hybridisation products,

b) transferring the chemical entities of the nucleic acid fragments to the nascent
bifunctional complexes through a reaction involving a reactive group of the
nascent bifunctional complex, in conjunction with a transfer of the genetic
information of the anticodon.

61. The method according to claim 60, further comprising step c) separating the
components of the hybridisation product and recovering the complexes.

62. The method of claim 60 or 61, wherein steps a) through c) are repeated as
appropriate using the recovered complexes in step c) as the nascent bifunctional
complexes in step a).

63. The method according to claims 60 to 62, wherein the genetic information of
the anticodon is transferred by enzymatically extending the identifier nucleic acid
sequence to obtain a codon attached to the bifunctional complex having received the
chemical entity.

64. The method according to any of the claims 60 to 63, wherein the genetic
information of the anticodon is transferred to the nascent complexes by hybridisation to
a cognate codon of the identifier region.

65. The method according to any of the claims 60 to 64, wherein the second
generation library is subjected to a partitioning according to step ii) of claim 1.

66. The method according to any of the claims 1 to 65, wherein, prior to the
partitioning according to claim 65, the second generation library of complexes is
contacted with sequences complementary to the identifier nucleic acid sequences, and
the complexes which have hybridised with the complementary sequences are
recovered and used in the method of claim 60.

67. The method according to claim 66, wherein the hybridisation product, prior to
recovery of the complexes, is treated with an enzyme cleaving in the event a mismatch
occurs.

68. The method according to claim 67, wherein the enzyme is selected from T4
endonuclease VII, T4 endonuclease I, CEL I, nuclease S1, or variants thereof.

69. The method according to any of the claims 1 to 68, wherein the second-
generation library is prepared using chemical entities appearing in the initial library and
chemical entities foreign to the initial library.

70. The method according to any of the claims 1 to 69, wherein the chemical
entities used in the formation of the second-generation library occur in a concentration
above a certain threshold in the partitioned library.

71. The method according to claim 70, wherein certain chemical entities
occurring above a certain threshold are excluded in the second or further generation
library.

72. A composition of molecules with an improved desired property, obtainable
according to the method of any of the claims 1 to 71.

73. A molecule identifiable by subjecting a composition of molecules obtainable
by a method according to any of the claims 1 to 71 to a condition partitioning members
having encoded molecules displaying a predetermined property from the remainder of
the composition, and identifying the partitioned encoded molecule(s).

74. The molecule according to claim 73, wherein the encoded molecule(s) are
identified by decoding the identifier nucleic acid sequence.

75. The molecule according to claims 73 or 74, wherein the composition of
molecules used in claim 73 is a second or further generation library.